Journal of Experimental Psychology

Vol. 37, No. 3 June, 1947


REACTIVELY HOMOGENEOUS COMPOUND TRIAL-AND- 
ERROR LEARNING WITH DISTRIBUTED 
TRIALS AND TERMINAL 
REINFORCEMENT ¹


BY ALLEN J. SPROW 


Department of Psychology, 
Institute of Human Relations, 
Yale University 


INTRODUCTION


Although studies (2, 7) of heterogeneous compound trial-and-
error learning with terminal reinforcement ² have been completed,
no comparable experiment of the related and simpler homogeneous 
compound trial-and-error learning has been reported. In a four- 
choice-point linear maze with four choices at each choice point, Hill 
(2) and Hull (7) both found that in the case of heterogeneous com- 
pound trial-and-error learning the ends of the maze, i.e., the first and 
fourth choice points, were easier for the animals to learn than was 
either of the two intermediate choice points. These results in
superficial disagreement with the goal gradient principle (3) were 
interpreted as being due to the convergence toward the middle of the 
series of the competitional interference by the other (heterogeneous) 
reactions being learned at the same time. 


1 This investigation is part of the coordinated research program of the Institute of Human 
Relations, Yale University. The writer is greatly indebted to Professor Clark L. Hull, under 
whose direction the experiment was carried out, for valuable advice and aid of many kinds. 
Mr. Harry G. Yamaguchi derived the equations and fitted the curves in Figs. 4 and 5. 

2 Homogeneous compound trial-and-error learning is differentiated from heterogeneous
compound trial-and-error learning in that the former requires the same reaction at each choice
point while the latter requires a different reaction at each choice point. Terminal reinforcement
occurs by providing a reward, e.g., food, only following the correct reaction to the last of a series
of acts. The present article is the third of a series involving various combinations of the above 
variants in compound trial-and-error learning (7) and chain reactions (1) now completed in the 
laboratories of the Institute of Human Relations.

In the case of homogeneous compound trial-and-error learning,
Hull (8, Chap. VII) deduced that these generalized tendencies would
facilitate rather than oppose each other, and that when learning trials are
broken down so that errors in the early, intermediate, and final
stages of learning are isolated, the following results would obtain:


1. At the very first trial, the maximum error probability would 
fall at choice point I and the minimum error probability at the choice 
point closest to the point of reinforcement, i.e., choice point IV, mainly
in accordance with the principle of the transference of learning from 
the earlier correct choices to the later ones during that trial. 

2. In the final stages of learning: 


A. The maximum error probability would fall at choice point I 
as would be expected were the goal gradient uncomplicated by other 
considerations. 

B. The minimum error probability would fall at choice point 
III, as would not be expected in accordance with the uncomplicated 
goal gradient. 

C. The error gradient would fall from a maximum at choice 
point I to a minimum at choice point III with a progressively slowing 
rate, after which it would rise at choice point IV. 


3. In the intermediate stages of learning, the results would lie 
between those of the early and final stages of learning. 


The present study was undertaken to investigate empirically the 
above theoretical expectations. 


APPARATUS AND PROCEDURE 


The apparatus used in this investigation is identical with that used by Hull (7) and similar
to that used by Hill (2). A diagram of the floor plan is shown in Fig. 1. Four choice points,
with four choices or valve-like doors at each choice point, divided the runway, which was 12
feet long, one foot wide, and 6.5 in. high. Each section was covered with a hinged, wire-mesh
frame. The doors were hinged at the top and sloped at an angle of 60° so that when unblocked
the pressure of the animal’s head could lift them in a goalward direction. Since the doors could
not be forced open from the other direction, retracing was impossible. A 25-watt frosted electric
light bulb, shielded by a 3.5-in. cubic box, was suspended 10.5 in. above each dividing passageway.
The only other illumination in the experimental room was provided by the large central lighting
fixture.

Fig. 1. Diagrammatic representation of the floor plan of the maze used in the present
study. The animals were placed in the maze at S. P represents the partition in each section
of the maze with a 2.5-in. passageway in its center forcing the animals to make their choices of
doors at choice points I, II, III, and IV from a comparable position in the runway. The animals
were fed upon reaching G. The dotted line represents the correct pathway through the maze
for an animal whose learning task was to choose door no. 1 at each choice point between S and
G and to make the same reaction (homogeneous) at each choice point. Analogous pathways
were followed by animals whose task was to learn to choose doors no. 2, 3, and 4 respectively.

The animals used in this experiment were male albino rats purchased from the Albino Farms, 
Redbank, New Jersey. They had had no previous laboratory experimental experience. In 
addition to the four pellets received in the maze, three Nurishmix dog biscuits were fed them 
when they were placed in their individual living cages upon completion of their daily run in the 
maze. 

In order to habituate them to the apparatus, the animals, which had not been fed for 23½
hours, were put into the maze at S and trained to go from the anterior end to the posterior end,
G, all the valves being raised to permit free passageway. Here the animals found four pellets
of food made from a mash of pulverized dog-chow biscuit, wheat flour, and water. The pellets,
which were approximately [fraction illegible] in. in diameter, lay on a 3 × 5-in. piece of white paper. This
habituation was continued until the animals would run through the maze without hesitation. 
On the following day, training was begun with all doors lowered but none locked. A record was 
made of the door chosen at each choice point on these latter trials. Five trials per day were 
given each animal during habituation training. As soon as the animals would run through the 
maze under these conditions without hesitation, the experiment proper was begun. 

At the conclusion of the preliminary habituation, the animals were divided, each rat being 
assigned to a door as far as possible from the one preferred during the preliminary training; this 
was done with the expectation that when all doors but one at each choice point were locked, some 
new learning would necessarily take place. The animals were then trained to run from the 
anterior end to the posterior end of the maze by choosing the same door and making the same 
reaction (homogeneous) at each of the four choice points. Errors and correct choices were re- 
corded electrically by signal markers on waxed polygraph paper which was forced past the 
markers at the rate of 0.1 in. per sec. by a constant-speed Telechron motor. To avoid the
possibility of an animal using as a cue the track of the rat previously run, the Ss were run in 
different, pre-arranged sequences so that each sequence was always preceded by a different one. 
The animals were run once each day, at the same time of the day, six days per week, until they 
had made no error for 10 consecutive experimental sessions. 


MAIN RESULTS


In order to secure data from the present experiment which would 
be comparable to those reported in the most closely analogous 
previous investigation (10), the average of the total errors made by 
the 40 animals at each choice point was calculated. This is pre-
sented graphically in Fig. 2. It will be seen that the maximum
mean number of errors occurs at choice point I; the minimum at 
choice point III, between the middle and the posterior end of the 
runway; and the decrement from choice point I to choice point III 
is decelerated. The tilt-up at choice point IV, which is the critical 
point of this figure, is in accordance with theoretical expectation. 

Table I gives the tabulation of the mean number of total errors
at the several choice points together with standard deviations and
standard errors. Although the animals sometimes pushed the same
door a number of times in succession at a rate as rapid as several per
sec., these rapid, successive pushes were not recorded as separate
errors unless 10 sec. with no further pushes, or an attempt at another
door, intervened. This criterion, which was deemed adequate to indicate
discrete and unrelated errors at a given door, is the same as that
adopted by Hill (2) in his study already mentioned. The product-moment
correlation coefficients between the mean number of total
errors made at the several choice points, the critical ratios of the
differences between the means, and their statistical reliabilities are
also presented in Table I. Those differences between choice points I
and II and between II and III are in the expected direction and reasonably
satisfactory, but the critical difference, that between III and IV,
while in the theoretically expected direction does not have a satisfactory
reliability.

Fig. 2. Graphic representation of the mean number of total errors made by 40 animals
at the several choice points during learning. The graph is plotted from data given in column 1
of Table I. (Ordinate: mean number of errors; abscissa: choice points.)


TABLE I

DATA ON THE MEAN NUMBER OF TOTAL ERRORS MADE BY 40 ANIMALS AT THE SEVERAL CHOICE
POINTS DURING LEARNING TRIALS

Choice Point     Mean      σ       σM
I               10.33    7.49    1.18
II               6.65    5.91     .94
III              5.25    3.90     .62
IV               5.45    4.93     .78

Pairs of
Choice Points      r     Diff.   σM1−M2*    C.R.      P
I, II             .50    3.68     1.08      3.41    1.00−
I, III            .29    5.08     1.17      4.34    1.00−
I, IV                    4.88     1.24      3.94    1.00−
II, III           .72    1.40      .64      2.19     .99+
II, IV            .47    1.20      .89      1.35
III, IV           .64     .20      .62               .63

* The standard error of the differences between means was calculated by use of the formula:

$\sigma_{M_1 - M_2} = \sqrt{\sigma_{M_1}^2 + \sigma_{M_2}^2 - 2 r\,\sigma_{M_1}\sigma_{M_2}}.$
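
The entries of Table I can be checked by hand or with a few lines of code. The following is a minimal sketch, assuming that the footnote formula is the usual standard error of a difference between correlated means and that the P column is the one-tailed normal probability of the critical ratio; neither assumption is stated outright in the text, and the function names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def se_of_difference(se1, se2, r):
    """Standard error of the difference between two correlated means."""
    return sqrt(se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2)


def critical_ratio(mean1, mean2, se1, se2, r):
    """Difference between the means divided by its standard error."""
    return (mean1 - mean2) / se_of_difference(se1, se2, r)


# Table I, choice points I and II: means, standard errors of the means, and r.
se_diff = se_of_difference(1.18, 0.94, 0.50)
cr = critical_ratio(10.33, 6.65, 1.18, 0.94, 0.50)

# Assumption: P is the one-tailed normal probability associated with the C.R.
p = NormalDist().cdf(cr)

print(round(se_diff, 2), round(cr, 2), round(p, 4))
```

Under these assumptions the pair I, II gives a standard error of about 1.08 and a critical ratio of about 3.41, in agreement with the tabled values.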

To show the operation of a special factor in early and inter- 
mediate but not in final stages of learning, the mean first-choice 
errors during the first one-fifth, the next two-fifths, and the final 
two-fifths of learning trials were calculated, the final trial being 
taken as that in which the last error occurred. This division of trials 
was carried out regardless of whether a given rat learned rapidly or 
slowly. These means are shown graphically in Fig. 3. There it 
will be noted that in the early stages of learning, i.e., the first one- 
fifth of the learning trials, the mean number of first-choice errors falls 
from a maximum at choice point I to a minimum at choice point IV.
This is what would be expected when the transfer from early choice 
points to later ones in the same trial is uncomplicated by another 
type of transfer to be discussed presently. 

An examination of the graph representing the last two-fifths of
the learning trials shows that the maximum mean number of errors 
falls at choice point I, the minimum falls at choice point III, and the 
theoretically expected upward tilt at choice point IV is now very 
marked. 

Inspection of the dotted-line graph in Fig. 3 shows that for the 
intermediate stages of learning, there is a transition taking place 
from the situation in which the relatively uncomplicated forward 
transfer within the series occurs as in the early stages of learning to 
the situation in which the two types of stimulus generalization are 
superimposed on the goal gradient as in the advanced stages of 
learning. 

In Table II it can be seen that while only the difference in mean
number of first-choice errors between choice points I and II is
significant at the one percent level, the difference between III and
IV, the critical one, approaches significance at the 10 percent level
but does not quite reach it.


SOME ACCESSORY RESULTS


Often the results of an experiment throw light on problems not 
primarily involved in the original plan. The present study is no 
exception. It gives suggestive indication of the molar laws of the
relationship of: (1) the number of errors to the number of trials; (2)
the reaction potential to the number of trials; and (3) the reaction 
potential to the number of errors. Special analysis of the data is 
necessary in each case to demonstrate in a quantitative manner these 
presumptively lawful relationships.

Fig. 3. Graphic representation of the mean number of first-choice errors at the several
choice points during the first one-fifth, the next two-fifths, and the final two-fifths of learning
trials. The means are based on first-choice errors made by 40 animals.


TABLE II

DATA ON THE MEAN NUMBER OF FIRST-CHOICE ERRORS MADE AT THE SEVERAL
CHOICE POINTS BY 40 ANIMALS DURING THE LAST TWO-FIFTHS
OF THE LEARNING TRIALS

The standard error of the differences between means was calculated by the same methods
as were used with Table I.

Choice Point     Mean       σ       σM
I               1.305    1.064    .168
II               .845    1.121    .177
III              .720     .777    .123
IV               .970    1.633    .258

Pairs of
Choice Points      r      Diff.   σM1−M2    C.R.      P
I, II            .290     .460     .205     2.24     .99
I, III          −.227     .585     .228     2.57     .99
I, IV            .096     .335              1.14     .87
II, III          .399     .125     .170      .74     .77
II, IV           .735     .125     .176      .71     .76
III, IV          .541                       1.14     .87


TABLE III

THE CUMULATIVE REINFORCEMENTS, CALCULATED MEAN VINCENT ERRORS,
AND REACTION POTENTIALS

Cumulative Preceding     Mean Number of Vincent     Reaction Potential
Reinforcements (N)       Errors per Trial (R)       (sER)
        0                       8.10                      .00
        1                       5.74                      .60
        2                       4.64                     1.20
        3                       3.85                     1.60
        4                       2.65                     2.10
        5                       2.01                     2.23
        6                       1.46                     2.72
        7*                      1.76*                     *

* The sER value is not given for N = 7; it is meaningless because of the well-known
distortion of the ‘end spurt’ in the value of the Vincent R’s. The ‘end spurt’ expectation is
confirmed by the fact that R rises at this point from 1.46 to 1.76.


We shall begin with the relationship of the number of errors (R)
to the number of trials (N). Following a technique described by
Hull (4, p. 245, footnote 4), the total numbers of errors made by each
animal at all choice points were first converted into justly weighted
Vincent values to equalize the differing rates of learning. This was
done by dividing the total number of errors each animal made during
learning trials into the equivalent of a record of eight learning trials,
the median number of actual learning trials up to and including the
last trial in which an error was made. Means of these justly weighted
Vincent values were then computed. These mean numbers of errors
per trial are listed in column 2 of Table III. They give a rather
coarse indication of the functional measure of the errors in the maze
as a whole only, inasmuch as the special summative effects and rates
of learning at the several choice points are not differentiated. These
values are represented by black dots in Fig. 4 as a function of the
number of trials. An equation was fitted to these two sets of values.
It is

$R = 8.2 \cdot 10^{-.123N}.$ (1)

The smooth curve running among the black dots in Fig. 4 has been
plotted from values calculated by substituting N-values in this
equation. The proximity of the data points to the fitted curve shows
that the equation is a fair first approximation to the law relating R
to N. The relationship in question accordingly turns out to be an
exponential decay function.
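
The method by which equation (1) was fitted is not stated. As a rough check only, the following sketch assumes an ordinary least-squares line through log10 R plotted against N, using the Table III values for N = 0 to 6 (the N = 7 value is left out because of the end spurt, and the label N = 0 for the first row is inferred from the column heading). The function name is illustrative.

```python
from math import log10

# Table III: cumulative preceding reinforcements (N) and mean Vincent errors (R).
N = [0, 1, 2, 3, 4, 5, 6]
R = [8.10, 5.74, 4.64, 3.85, 2.65, 2.01, 1.46]


def fit_exponential(n_values, r_values):
    """Least-squares line through (N, log10 R).

    Returns (a, b) such that R is approximated by a * 10 ** (-b * N).
    """
    y = [log10(r) for r in r_values]
    n_mean = sum(n_values) / len(n_values)
    y_mean = sum(y) / len(y)
    sxy = sum((n - n_mean) * (v - y_mean) for n, v in zip(n_values, y))
    sxx = sum((n - n_mean) ** 2 for n in n_values)
    slope = sxy / sxx
    intercept = y_mean - slope * n_mean
    return 10 ** intercept, -slope


a, b = fit_exponential(N, R)

# Values of R given by the published constants, for comparison with the data.
published = [8.2 * 10 ** (-0.123 * n) for n in N]

print(round(a, 2), round(b, 3))
```

A fit of this kind yields constants close to, though not identical with, the published 8.2 and .123, which is what would be expected if the original curve was fitted by a somewhat different procedure.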

We proceed next to the consideration of the relationship of
reaction potential (sER) to the number of trials (N). To do this,
it is necessary first of all to determine the values of sER in a quantitative
manner, which must be done by utilizing a procedure not
previously published. Our basic data upon which this operation
depends are the above-mentioned Vincentized errors made by the 40
individual animals at the different trials. Inasmuch as errors
decrease as learning increases, there is the necessity of converting the
errors (R) made by the Ss into reaction potentials (sER). Because
of the inverse relationship and the impossibility of using one as a
direct measure of the other, a special procedure must be devised for
this. A research memorandum by Hull (6, p. 4 ff.) proposed a
technique for calculating the reaction potentials as a function of the
number of trials, from reaction latency data, and it has been utilized
by Hull, Felsinger, Gladstone, and Yamaguchi (9). This technique
has been roughly adapted by us to the derivation of reaction potential
as a function of the number of trials, from the errors (R) made
during the learning process.

Fig. 4. Graphic representation of mean number of errors per Vincent trial (R)
plotted as a function of the cumulative preceding reinforcements (N)

The values of each individual animal’s error records obtained by 
the justly weighted Vincent process described above were compared 
with one another in a manner analogous to that employed by Thorn- 
dike (11) in developing a scale for handwriting: Each animal’s 
Vincentized score on the first trial was compared with that on the 
second, the second with the third, the third with the fourth, and so 
on much as was done with Thorndike’s handwriting samples. When 
the first of a pair being compared was greater than the second, a 
value of one was assigned; when the first was less than the second, 
a value of zero was assigned; when the two were equal, as occasionally 
happened, a value of 0.5 was assigned. These values were placed 
in a table having the same number of rows as there were animals, 
i.e., 40, and one column fewer than the number of Vincentized trials, 
i.e., seven. The values in each column were then averaged, the
resulting means representing the empirical probability that the 
second trial has fewer errors than the first, the third fewer than the 
second, and so on. The final comparison of the series of Vincent 
values was omitted from the calculations in order to prevent the 
well-known end-spurt artifact, characteristic of Vincent curves, from 
distorting the learning curve. These paired comparisons based on
learning errors were then converted into scale-value differences by 
the use of an appropriate table of the normal probability integral 
as usual in scaling procedures based upon the methods of psycho- 
physics. The unit of measure of these differences is the standard 
deviation (σ) of the S’s tendency to error variability from trial to
trial. These differences were combined cumulatively to secure the
successive values of reaction potential (sER) which result from the
successive trials, that at trial one being arbitrarily taken as zero. 
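
The scaling steps just described can be sketched in code. The version below is a minimal illustration under stated assumptions: the demonstration scores are hypothetical and are not the data of the study, the proportions are clipped away from 0 and 1 so that the normal deviate stays finite (a device the text does not mention), and no correction of the unit is applied because none is specified. The function names are illustrative.

```python
from statistics import NormalDist


def comparison_probabilities(scores):
    """scores holds one list of Vincentized error scores per animal.

    For each adjacent pair of trials, assign 1 when the earlier score is the
    greater, 0 when it is the smaller, and 0.5 for a tie; return column means.
    """
    n_pairs = len(scores[0]) - 1
    probs = []
    for k in range(n_pairs):
        values = []
        for animal in scores:
            if animal[k] > animal[k + 1]:
                values.append(1.0)
            elif animal[k] < animal[k + 1]:
                values.append(0.0)
            else:
                values.append(0.5)
        probs.append(sum(values) / len(values))
    return probs


def reaction_potentials(probs, eps=0.01):
    """Convert proportions to normal deviates and accumulate them from zero."""
    normal = NormalDist()
    potentials = [0.0]  # the value at trial one is arbitrarily taken as zero
    for p in probs:
        p = min(max(p, eps), 1.0 - eps)  # assumption: keep inv_cdf finite
        potentials.append(potentials[-1] + normal.inv_cdf(p))
    return potentials


# Hypothetical scores for four animals on three Vincent trials.
demo = [[6, 4, 3], [5, 5, 2], [7, 3, 4], [4, 6, 1]]
probs = comparison_probabilities(demo)
scale = reaction_potentials(probs)
print(probs, [round(v, 3) for v in scale])
```

With the real data the table would have 40 rows and seven columns, and the final comparison would be dropped before accumulation, as described above.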

The sER’s thus calculated are given in column 3 of Table III.
The relationship to N is shown graphically by the dots in Fig. 5.
A first approximation to the relationship represented by these data
is presumably given by the equation fitted to them, which proved to
be a power function:

${}_{S}E_{R} = .65\,N^{.8}.$ (3)


The smooth curve running among the empirical sER dots in Fig. 5
was plotted from values obtained by substituting successive N’s in
this equation. An extrapolation of this curve to N = 7, the logical
completion of the learning, yields an sER value of approximately 3σ,
which suggests the total range of this particular learning process.

Fig. 5. Graphic representation of reaction potential (sER) plotted as a power
function of the cumulative preceding reinforcements (N)

Finally, we proceed to the analysis of the present experimental
data to secure a rough first approximation to the presumptive molar
functional relationship, sER = f(R). Since both equations (1) and
(3) contain N as one of their variables, this relationship may be deduced
from our previous two determinations. We have already seen
that

$R = 8.2 \cdot 10^{-.123N}.$ (1)

Transposing and simplifying, we have

$N = \dfrac{\log 8.2 - \log R}{.123}.$ (2)

We have also seen that

${}_{S}E_{R} = .65\,N^{.8}.$ (3)

Accordingly, substituting the value of N as it appears in (2) into (3)
and simplifying, we have

${}_{S}E_{R} = 3.48\,(.914 - \log R)^{.8},$ (4)

which is presumably a first approximation to the mathematical
expression of the relationship we have been seeking.
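
The substitution leading to equation (4) can be verified numerically. The sketch below takes the constants of equation (1) as 8.2 and .123 and those of equation (3) as .65 and an exponent of .8; the exponent is an assumption here, chosen because it is the value that reproduces the coefficient 3.48 and the tabled sER values. The function names are illustrative.

```python
from math import log10

A, B = 8.2, 0.123   # equation (1): R = A * 10 ** (-B * N)
C, P = 0.65, 0.8    # equation (3): sER = C * N ** P (exponent assumed)


def errors_from_trials(n):
    """Equation (1): mean Vincent errors after n preceding reinforcements."""
    return A * 10 ** (-B * n)


def potential_from_trials(n):
    """Equation (3): reaction potential after n preceding reinforcements."""
    return C * n ** P


def potential_from_errors(r):
    """Equation (4): put N = (log A - log R) / B, from (2), into (3)."""
    return (C / B ** P) * (log10(A) - log10(r)) ** P


coefficient = C / B ** P   # the 3.48 of equation (4), to two decimals
constant = log10(A)        # the .914 of equation (4), to three decimals

for n in range(1, 7):
    r = errors_from_trials(n)
    print(n, round(potential_from_trials(n), 3), round(potential_from_errors(r), 3))
```

For every N from 1 to 6 the two routes to sER agree, which is all that the derivation of (4) from (1) and (3) requires.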


DISCUSSION


As stated above, the first-choice errors of the present study 
were divided in such a way as to isolate those errors made in the early, 
intermediate, and final stages of learning as illustrated graphically 
in Fig. 3. These results give empirical support to Hull’s Corollary 
XXX (8, Chap. VIII) which states that the very first trial will 
display a progressive mean reduction in the percent of erroneous 
reaction from the first to the last link in a series. Because of the 
identity of the correct (reinforced) responses and the external stimuli 
involved at the choice points and the fact that stimulus generalization 
(as shown by Hovland, Junius Brown, and others) is practically 
perfect at the beginning of reinforcement, it is to be expected that 
there will be immediate transfer of a perseverative nature during the 
early trials from each choice point to all subsequent ones, as was 
actually the case in this experiment. 

Further empirical support of Corollary XXX is implicit in an 
investigation reported by Warden and Cummings (12). They used 
a series of simple linear mazes with alternating right and left choices 
required. After ‘orienting’ albino rats by feeding them in the 
entrance and food boxes, the Es gave the animals slightly massed 
trials, i.e., three per day in immediate succession, during the experi- 
ment proper. Now, an alternation maze of this type may be con- 
sidered as made up of two homogeneous subseries inasmuch as the 
A-acts (e.g., right choices) all involve the same movements and 
practically the same stimuli, and the B-acts (e.g., left choices) likewise
all involve the same movement and practically the same stimuli,
both the stimuli and the movements of the A-acts being largely 
distinct from those of the B-acts. When corresponding pairs of A 
and B choice points are sorted out in parallel and pooled, the mean 
number of errors on the initial trial (less retracing errors) of 27 
animals for the first six choices of such paired parts of the maze was 
found by Hull (8, Chap. VIII) from data reported by Warden and 
Cummings in their Table II (12, p. 248) to be as follows: 


Choice points............    I + II    III + IV    V + VI
Mean number of errors....     .73        .26         .17

These results so far as they go not only are in exact conformity with
Hull’s Corollary XXX but also agree closely with the upper curve 
in Fig. 3. 

In the present study, empirical support was provided for the 
three major deductions, A, B, and C, formulated by Hull (8, Chap. 
VII) and stated in the present introduction, namely, at a late stage 
of learning the maximum error probability of a four-choice-point 
homogeneous compound trial-and-error learning situation falls at 
choice point I; the minimum error probability falls at choice point 
III; and the error gradient descends from choice point I to choice 
point III by a progressively slowing rate. It would be expected that 
by the operation of the goal gradient unmodified by generalization 
there would be more errors at choice point III than at choice point 
IV. Actually, in the present study at a late stage of training the 
minimum mean number of errors fell at choice point III and there 
was a tilt-up in the error gradient at choice point IV. Thus, em- 
pirical generalizations are completely in harmony with theoretical 
expectation, though unfortunately at the critical point (III, IV) the 
reliability is low and this taken alone leaves the relationship in 
uncertainty. However, uncertainties of this sort can be remedied by
the addition of pertinent supplementary evidence. Fortunately, a
considerable amount of such evidence is available, though its relevance
to the present problem has not hitherto been pointed out in
published form.

In a linear maze study involving six consecutive homogeneous 
choice points, Montpellier (10) used a total of 42 animals, giving 
them one trial per day. His data show that the mean number of 
total errors made at each choice point decreased from the first to the 
fourth choice points, after which there was a consistent rise at choice 
points 5 and 6. The weighted averages of the mean number of 
errors at each choice point (calculated from Montpellier’s published 
data, 10, Table I, p. 127) are represented graphically in Fig. 6. 
Thus the tilt-up in the mean error probability found in the present 
study is also found in the exactly analogous Montpellier data, 
though Montpellier himself quite missed the theoretical significance 
of this tilt-up in his results.

Turning again to the simple alternation type of maze with its 
two homogeneous A- and B-subsections as already analyzed, we find 
that if the two subseries are isolated, the errors averaged and com- 
bined in parallel, the results should be comparable with our own as 
plotted in Fig. 2. This has been done for three of the sets of maze 
data, taken from 9, 10, and 8 animals respectively, published by 
Warden and Cummings (12, p. 249) already referred to. The 
results are shown in Table IV. A glance at this table shows that
in no instance does the minimal difference fall at the posterior end
of the subsections of the mazes, or anterior to the middle of the subsections.
These three experiments accordingly give substantial
further demonstration of the tilt-up of error probability in homogeneous
compound trial-and-error learning.

Fig. 6. The mean number of total errors made by 42 blinded albino rats at each choice
point of a six-choice-point linear maze. Calculated from empirical data contained in Montpellier’s
Table I (10, p. 127).


TABLE IV

THE DIFFERENCE BETWEEN THE MEAN ERROR SCORES FOR ALBINO RATS TO MASTER
THE PARALLEL A B LINKS OF ALTERNATION MAZES
OF VARIOUS LENGTHS

Calculated by Hull (8, Chapter VIII) from results published by Warden and Cummings
(12, p. 249).

                  Order of Links in A- and B-subsections of Maze
Maze Length          1        2        3        4        5
6 links            2.46      .88     1.34
8 links            5.80      .60     −.90      .70
10 links           5.00     2.88     2.25     1.63     2.75


We now come to the last of this series of supporting experiments.
Arnold (1) in an unpublished Ph.D. thesis has found that in the case
of a simple four-link homogeneous reaction chain this same minimum
learning difficulty occurs at the third reaction point and that the
same tilt-up in learning difficulty occurs at the fourth reaction point.
Thus we have a total of five independent comparable experiments
which support the present findings. We conclude that even though
the statistical reliability of our own results regarding the error tilt-up
at choice point IV in four-choice-point homogeneous compound
trial-and-error learning is low, it is almost certainly a genuine
phenomenon.

Finally, let us compare homogeneous with heterogeneous com- 
pound trial-and-error learning, with terminal primary reinforcement, 
particularly in the manner that each deviates from the uncomplicated 
goal gradient (5, p. 135), and consider why such characteristic 
deviations occur. Whatever may turn out to be the exact quantita- 
tive characteristics of the goal gradient, this much seems clear at 
present: its maximum strength falls at the point of primary rein- 
forcement, and it decreases in strength with a progressively slowing 
rate as the acts in question grow more remote from that point. 
Moreover, as the reinforcements, both positive and negative, increase 
with added learning, the generalization gradients change from a 
practical horizontal to a progressively steepening negative growth 
function (5, p. 183). Also, in the linear maze both Hill (2) and Hull 
(7) have found that the steepness of the generalization gradient from 
the rear forward is not nearly so great as is that from the front of 
the maze toward the rear. But, since the two middle choice points 
receive generalized tendencies from two directions with only one 
step of reduction, whereas the two end choice points receive only one 
generalization of such strength, it follows that the two middle choice 
points (II and III) will receive more generalized tendencies than the 
two end ones. And, since the generalization gradient falls less 
steeply from choice point IV, choice point III will receive a consider- 
ably greater generalized tendency than will choice point II.

Now, in the heterogeneous form, the generalized reaction tendencies
will appear at other choice points as errors; this reduces the
number of first-choice correct reactions. In the homogeneous form,
however, since the generalized reactions are the same as those already
at the points in question, the effect of the generalization will result
in summation rather than interference. It follows that in the 
heterogeneous form the error distortion will deviate, particularly at 
choice point III, from that characteristic of the goal gradient in that 
the errors are increased, whereas in the homogeneous form the error 
distortion will deviate in that the errors are decreased. 

Thus, in the heterogeneous experiments of Hill (2) and Hull (7) 
the number of first-choice errors at choice point III deviates from the 
characteristics of the goal gradient in being markedly greater than at 
choice point I. On the other hand, in the present homogeneous study
the characteristic deviation from the goal gradient is that the number 
of errors at choice point III is markedly less than at choice point IV. 
Accordingly, the two types of experiments differ radically and 
characteristically in outcome exactly as is to be expected from the 
differing conditions. Thus each type of learning, by conforming to
the theoretical principles involved, confirms and supports the 
quantitative experimental findings of the other. 


SUMMARY 


In a four-choice-point linear maze with four doors at each choice 
point, 40 albino rats were trained to choose the same door at each 
choice point (homogeneous form of compound trial-and-error learning)
for food reinforcement at the posterior end of the maze. The
rats were given one trial per day until 10 consecutive trials without 
error had been run. Correct and incorrect choices were recorded 
electrically on a waxed-paper polygraph. 

The results of the present study may be summarized as follows: 


1. When the first-choice errors at the several choice points are 
separated in such a way as to divide the learning trials into early, 
intermediate, and final stages, the mean number of first-choice 
errors in the early stages of learning falls from a maximum at choice 
point I by a progressively slowing rate to a minimum at choice 
point IV. 

2. During the final stages of learning, the mean number of first- 
choice errors falls from a maximum at choice point I by a progress- 
ively slowing rate to choice point III, after which it rises sharply at 
choice point IV. 

3. Clearly intermediate between these extremes is the gradient 
which occurs at the intermediate stage of learning. 

4. When the total errors of all three stages of learning are pooled 
as in Fig. 2, the characteristic tilt-up of errors at choice point IV is 
considerably blurred but still in evidence. 

5. Clearly supporting results of this tilt-up of learning difficulty 
in the goal gradient situation are found consistently in five analogous 
homogeneous compound trial-and-error learning experiments already 
reported in the literature. 

6. Theoretically, it may be deduced that: 


A. As in the heterogeneous form of compound trial-and-error 
learning, the two stimulus generalization gradients here also act in 
such a way as to accumulate, especially on the two central choice 
points. 

B. Since in the present investigation the generalized reactions 
are homogeneous, they summate and the accumulation increases the 
correct reactions (reduces the errors) at the two central choice 
points beyond what would be yielded by the uncomplicated goal 
gradient. 

C. As a result, in homogeneous compound trial-and-error learn- 
ing, the maximum decrease in mean errors below what would result 
from the uncomplicated goal gradient should occur between the 
middle and the posterior end of the maze exactly as it does in fact. 

An analysis of the accessory data secured primarily for the
determination of the relationship of homogeneous compound
trial-and-error learning to the goal gradient leads to a preliminary
determination of three presumably basic molar relationships. 


7. The number of errors in the present type of situation appears 
as a first approximation to follow the exponential decay law, 


R = 


8. The reaction potential, calculated from the individual Vincent 
error scores, appears as a first approximation to conform to the power 
function, 


$_{S}E_{R} = .65\,N^{.8}$. 


9. There is a suggestion that the total range of this particular 
learning process is 3σ of the variability of the tendency of these 
animals to make this type of error. 

10. The relationship of the number of errors to reaction potential 
derived from the two preceding equations seems to take a logarithmic 
form, 


$_{S}E_{R} = 3.48\,(.914 - \log R)^{.8}$. 
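
The logarithmic form in item 10 can be checked against the power function in item 8 by eliminating N. The following is a minimal sketch in Python, not part of the original article. It assumes that the constants read .65, .8, 3.48 and .914, and that "log" is the common (base-10) logarithm; the function names are illustrative only. Under those assumptions the two equations imply that log R falls linearly with N, which is the exponential decay form named in item 7.

```python
import math

# Assumed readings of the printed constants (not confirmed against a clean copy).
A, P = 0.65, 0.8      # item 8:  sER = A * N**P
C, K = 3.48, 0.914    # item 10: sER = C * (K - log10(R))**P


def potential_from_trials(n):
    """Reaction potential as a power function of N (item 8)."""
    return A * n ** P


def decay_rate():
    """Slope of log10(R) on N implied by equating items 8 and 10."""
    # A * N**P = C * (K - log R)**P  ->  log R = K - (A / C)**(1 / P) * N
    return (A / C) ** (1 / P)


def errors_from_trials(n):
    """Errors R implied for a given N; an exponential decay in N."""
    return 10 ** (K - decay_rate() * n)


def potential_from_errors(r):
    """Reaction potential as a function of errors (item 10)."""
    return C * (K - math.log10(r)) ** P


if __name__ == "__main__":
    for n in range(1, 8):
        r = errors_from_trials(n)
        print(n, round(r, 3), round(potential_from_errors(r), 3),
              round(potential_from_trials(n), 3))
```

The last two printed columns agree for every N, so the three relationships are mutually consistent under the assumed readings. This does not recover the constants that the article itself printed for the decay law.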


(Manuscript received July 5, 1946) 


REACTION LATENCY ($_{S}t_{R}$) AS A FUNCTION OF THE 
NUMBER OF REINFORCEMENTS (N)¹ 


BY JOHN M. FELSINGER, ARTHUR I. GLADSTONE, HARRY G. YAMAGUCHI, 
AND CLARK L. HULL 


Institute of Human Relations, Yale University 


INTRODUCTION 


In the writing of the volume Principles of behavior (5), the author 
found it almost an expository necessity to have available at least 
the formal characteristics of a number of important mathematical 
behavioral principles or molar laws. These were accordingly
postulated even though elaborated evidence was at the time largely lacking. 
Moreover, the expository need of giving specific examples of the 
working of the general theoretical approach necessitated the as- 
sumption of numerous quantitative constants which are essential 
characteristics of the equations necessarily involved. Finally, also 
for expository reasons, a number of behavioral measurement units 
(the wat, the hab, the mote, and the pav) were employed, for which 
satisfactory quantitative definitions and genuine values had not been 
worked out. In short, a considerable part of the systematic structure 
in question was programmatic (5, p. 281). 

That such a state of affairs should continue longer than circum- 
stances make necessary is intolerable. Accordingly, early in 1943 
definite efforts to perform these quantitative determinations were 
begun (7, pp. 1, 2; 6, pp. 167 ff., 176 ff.), and since that time work 
has continued without interruption. The present study is the first of 
a series, in each of which we propose to make a tentative empirical 
determination of one or another aspect of one or more of the pre- 
sumptive laws in question. 

1 This is the first of a series of integrated studies, the objective of which is to make an
explorational determination of some quantitative behavioral principles which are presumed to be 
primary molar laws. The experimental work and measurement of records on this project were 
performed by Mr. Felsinger. The statistical analysis and curve fitting were done by Mr.
Gladstone and Mr. Yamaguchi. Mr. Hull supervised the investigation generally, and prepared the 
manuscript. 

The history of the development of the older and more mature 
sciences strongly suggests the difficulties likely to be encountered in 
carrying out such a pioneering series of projects, and our own
experience thus far completely harmonizes with such expectations. A 
historical example resembling the present situation in more than a 
superficial manner is that concerning physical heat (1, 2). Among 
the major problems gradually solved by the scientists in this field 
were: the devising of suitable measuring instruments and
methodologies (Galileo, 1642); the invention of satisfactory units of
measurement (Fahrenheit, about 1708, Celsius, about 1742, Kelvin, about 
1850); the discovery of the absolute zero (Amontons, 1701); the find- 
ing of equations expressing functional relationships such as that 
between heat and volume of gases (1787); the determination of 
certain critical empirical constants such as the mechanical equivalent 
of heat in Joule’s day (1846); and the results of atomic disintegration 
in ours. Another of the major historical problems involving heat, 
long a center of violent controversy among the best minds of the day, 
e.g., Laplace and Lavoisier, concerned the soundness and general 
useful applicability of such concepts or constructs as energy, phlogiston 
and caloric (1, 2). Analogous problems in the field of behavior are: the 
finding of the equation expressing reaction potential ($_{S}E_{R}$) as a
function of reaction latency ($_{S}t_{R}$); reaction potential as a function
of the number of reinforcements (N); drive (D) as a function of the number 
of hours’ hunger; the determination of the distance of the absolute 
zero of reaction potential from the reaction threshold ($_{S}L_{R}$); the 
invention and quantitative definition of suitable units of measurement
such as the wat, the hab, and the mote; and so on (5). The 
controversies concerning energy, phlogiston and caloric in the field of 
heat present a striking general parallelism to those now existing in the 
field of psychology, centering around the rival concepts and con- 
structs used by various behavioristic and Gestalt groups. 

But the history of the evolution of the science concerned with 
heat not only warns us not to be overly optimistic about a quick 
victory; it also indicates the methodology of quantitative experi- 
mentation and mathematical formulation by which the earlier scien- 
tists achieved success. Even so, three hundred and fifty years elapsed 
after the invention of the thermometer before the nature and prop- 
erties of heat took on their present significance, and two and a half 
centuries elapsed before the thermometer gave rise to the principle 
of the conservation of energy by Helmholtz, about 1847. 

The present study concerns a small segment of the general be- 
havioral objective sketched above. As indicated by the title, it 
concerns the empirical determination of the molar law of reaction 
latency ($_{S}t_{R}$) as a function of the number of reinforcements (N). 
Formally stated, the problem whose answer we are seeking is, what is 
the function (f) in the expression 


$_{S}t_{R} = f(N)$. 



SUBJECTS 


The Ss employed in this investigation were 59 male albino rats purchased from the Albino 
Farms, Redbank, New Jersey. The ages at the beginning of training varied from 12 to 16 weeks. 
They were fed a diet of ‘Nurishmix’ dog biscuit supplemented daily by three drops of codliver 
oil. The animals were housed in round individual cages, 9 in. in diameter and 6.5 in. high. 


APPARATUS AND EXPERIMENTAL PROCEDURE 


The apparatus was a modification of the Ellson-Perin (3, 8) form of the Skinner box. This 
is a sound-shielded box with a restricted animal chamber 6 in. wide, 8.5 in. long, and 5 in. in 
height. Projecting horizontally } in. into this chamber through a slit in a brass panel on the 
posterior wall was a straight brass tubular manipulandum, } in. in diameter, the bore of the tube 
being 7s in. in diameter. This manipulandum was pivoted so as to move horizontally to the 
left under a pressure of approximately 7.5 gm. When moved a distance of jy in., an electrical 
contact was made behind the panel. This contact activated two electromagnets. The first 
magnet released a holding ratchet which permitted the manipulandum instantly to retract from 
the slit under the action of a long coil spring. The second magnet permitted a second tube 
holding a cylindrical pellet of specially prepared food projecting } in. from its end to be thrust 
forward through the slit in the panel at a lateral distance of yy in. from the position originally 
occupied by the manipulandum. This latter tube was activated by a weight and pulley com- 
bination slowed up near the end of its forward movement by an air dashpot so as not to frighten 
the animal. One-half in. in front of the manipulandum was a sheet-metal shutter which could 
be raised and lowered by a cord passing through a small hole in the wall of the box. 

Previous to the beginning of the experiment, each animal was thoroughly habituated 
(adapted) to the apparatus by being fed in it the same kind of food as that used as a reinforcing 
agent in the learning. With most animals nine 10-min. feeding periods of this nature were 
given, at the rate of three per day. After each such period the animal was returned to its living 
cage. In the case of very timid animals, 15 periods of 10 min. each, spread over five days, were 
given. At the last three periods the shutter was down when the animal was placed in the box. 
It was then raised and at the termination of the period it was lowered just before the rat was 
removed. All rats were handled, petted, and stroked by the E at each habituation period. 

In order to induce the animals to operate the manipulandum on the first experimental trial, 
30 sec. after the raising of the shutter a cylinder of the same food was inserted into the mani- 
pulandum tube but so far back from the end accessible to the animal that the food could be 
smelled and also touched by the animal’s tongue, but could neither be seen nor eaten. The 
presence of this food led the animal to worry the bar in a random manner and in so doing ulti- 
mately to push it a little to the left. Such responses, because caused by the direct stimulation 
of the food, are classed as unconditioned reactions. As already pointed out, this latter displace- 
ment made an electric contact which resulted in the retraction of the manipulandum and at once 
thereafter a presentation of the food reinforcement. Actually the reinforcement tube itself did 
not project so far into the chamber as the manipulandum; indeed, it was not very clearly visible. 
But the cylinder of food projecting from it was definitely visible. This food was almost in- 
variably eaten as soon as presented, which gave practically instantaneous reinforcement. 

On the second experimental trial, given 24 hours later, the manipulandum tube was presented 
as before but with no food in it. If the animal had not operated it within 60 sec. of the time the 
shutter was raised (which constituted a conditioned reaction), the shutter was lowered and the 
manipulandum was baited with a food cylinder as on the first trial; then the shutter was again 
lifted until the animal had operated the manipulandum and eaten the reinforcing food, when it 
was lowered and the animal was returned to its cage. As always, if the rat responded to the 
food placed in the manipulandum the response was classed as an unconditioned, i.e., an un- 
learned, reaction. On the third and fourth trials the manipulandum was baited if the reaction 
had not occurred after 180 sec., and on subsequent trials it was not baited at all. The series of 
latencies utilized in the present study began with the first conditioned reaction following the 
last unconditioned reaction of the first four trials. 

Approximately one hour after receiving the reinforcing pellet following the daily training 
trial, each animal was allowed to eat in its cage for 45 min. This means that the trials or tests 
occurred at all times under an approximately 22-hour food-privation motivation. As already 
suggested, the learning took the extreme form of distributed trials, consisting of a single reinforced 
reaction each 24 hours throughout the entire training of each animal. This unusually laborious 
technique was employed in order to avoid complications which otherwise would have arisen from 
perseverative ‘inhibition of reinforcement’ due to repeated reaction evocations (5, p. 289), re- 
duction in motivation due to the recent ingestion of food, and warming-up effects, all of which 
would have tended to distort the latencies which are the basic data of the investigation. 

It must be noted at this point that the above technique was inadvertently defective in that 
during the first four trials there was a tendency to select the shorter of some fast Ss’ latencies in 
case they began, for example, to respond in less than 30 or 60 sec. on the first and second trials 
respectively, or less than 180 sec. on the third and fourth trials. As will appear later, clear 
evidence exists that such a distortion appears in the means, at least, of the first two conditioned 
reactions. 

Both the movement of the shutter and the movement of the manipulandum had electrical 
connection with electromagnetic markers which recorded the occurrence of the movements in 
question on a pair of constant-speed, waxed-paper polygraphs. One polygraph, of slow speed 
for use in the early stages of learning when the reaction latencies were long, permitted the reading 
of times to approximately 0.2 sec. The second polygraph, for use in more advanced stages of 
training, permitted the reading of times to 0.01 sec. In order to minimize reaction-latency 
variability, the shutter was never lifted until the animal stood before it in readiness to operate 
the manipulandum. This could be seen by E through a small plate-glass window in the top of 
the box. 

Reinforcements were continued with each animal until its reaction latencies had reached a 
stable minimum. This was difficult to determine because of the variability of such data. In 
order to locate it reasonably well, the median was obtained of each successive group of 10 latencies. 
These median values showed a fairly uniform course: they would fall toward a minimum at first 
rapidly and later more slowly. After having reached the minimum they would show a very 
slight and gradual increase in latency. In order to locate the point of minimal latency it was 
accordingly necessary to train the animals considerably beyond any suspected minimum to be 
sure that the true minimum had really been reached. After considerable preliminary trial with 
various rules for this procedure, the following was finally adopted: whenever a suspected minimum 
appeared at or before 21 to 30 reinforcements, the training was continued for 40 additional trials 
to make sure that some new minimum median did not appear. In case a suspected minimum 
appeared between 31 and 40 trials, 41 and 50 trials, or 51 and 60 trials, training was continued 
for 30 additional trials. If a suspected minimum appeared between 61 and 70 trials or beyond, 
training was continued for 20 additional trials. 
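
The over-training rule just described can be written out as a short sketch. The Python below is illustrative only and is not the authors' procedure as coded; the function names are hypothetical, and it assumes that a suspected minimum is identified by the last trial of the group of 10 in which it appears.

```python
from statistics import median


def block_medians(latencies, block=10):
    """Median of each successive complete group of `block` latencies."""
    return [median(latencies[i:i + block])
            for i in range(0, len(latencies) - block + 1, block)]


def additional_trials(block_end):
    """Extra trials to run after a suspected minimum.

    `block_end` is the last trial of the 10-trial group in which the
    suspected minimum median appeared (e.g. 30 for trials 21 to 30).
    """
    if block_end <= 30:
        return 40   # minimum at or before trials 21 to 30
    if block_end <= 60:
        return 30   # minimum within trials 31 to 60
    return 20       # minimum within trials 61 to 70 or beyond


if __name__ == "__main__":
    for end in (30, 40, 60, 70):
        print(end, additional_trials(end))
```

In each case the training continues for at least 20 trials past a suspected minimum, so that a later and lower median has a chance to appear before the asymptote is accepted.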

As already pointed out, responses to the baited manipulandum were considered uncondi- 
tioned reactions. Only responses successfully carried out on the unbaited manipulandum were 
considered conditioned, i.e., learned, reactions; the reaction latencies of these latter responses 
constitute the chief data of the present study. While the latencies of the former variety were 
taken, their value is considered of no special significance and no use was made of them. However, 
their number was regarded as extremely significant for the determination of the reaction threshold 
($_{S}L_{R}$), as will be shown in a subsequent study. 


RESULTS 


The experimental technique described above yielded the reaction 
latencies of 59 animals from the first trial to considerably beyond the 
point at which each reached the level where its latency was no longer 
consistently decreasing. At the outset of our attempt to determine 
the molar law, $_{S}t_{R} = f(N)$, we face the problem of variability in 
response latency. It is obvious that these latency data must be 
treated in some special way before any stable and consistent lawful 
relationship between the reaction latency and the number of rein- 
forcements can become evident. The manner of the treatment 
naturally depends upon the nature of the variability distribution. 
It happens that an indication of the distribution of these latencies, 
in so far as it depends upon the variability of the individual organism 
from itself uncomplicated by practice effects, is yielded by the 
latencies of the responses which proved to mark the attainment of 
the latency asymptote and beyond. A typical distribution by an 
animal having a considerable number of responses including the 
latency asymptote and beyond (Rat No. 31) is presented for
illustrative purposes in Fig. 1. A glance at this figure reveals that the 
distribution is markedly skewed, the data showing a decided tendency 
to being bunched toward the extreme of the shorter latencies. This 
skewness is reflected in a quantitative manner by the fact that the 
mean has a value of 1.35 sec., whereas the median has a value of only 
0.77 sec. 

Fig. 1. A typical distribution of 88 reaction latencies given by subject no. 31 during its 
10 asymptote latency reactions and subsequent over-training. Each circle represents one 
reaction latency. (Horizontal axis: reaction latency in seconds.) 
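
The effect of skewness on the two central tendencies may be illustrated with a small sketch. The latencies below are hypothetical values chosen only to be bunched toward the short end with one long delay; they are not data from the present study, and the helper name is illustrative.

```python
from statistics import mean, median

# Hypothetical latencies in seconds (not data from the study): most
# responses are quick and one is very slow, giving a right-skewed set.
latencies = [0.5, 0.6, 0.6, 0.7, 0.7, 0.8, 0.8, 0.9, 1.2, 6.0]


def central_tendencies(values):
    """Return (mean, median) of a list of latencies."""
    return mean(values), median(values)


if __name__ == "__main__":
    m, md = central_tendencies(latencies)
    print(f"all ten values:    mean = {m:.2f}, median = {md:.2f}")
    # Dropping the single long latency moves the mean far more than the median.
    m9, md9 = central_tendencies(latencies[:-1])
    print(f"without the 6.0:   mean = {m9:.2f}, median = {md9:.2f}")
```

A single long latency pulls the mean well above the median while leaving the median almost unchanged, which is the pattern reported for Rat No. 31.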

There is also a different type of variability involved in the present 
latency data. This second type of variability is that due to the 
individual differences in reaction latency among the various organ- 
isms employed. A rough indication of the range of these latter 
differences is afforded by the distribution of the means of the latencies 
of the 59 animals at their several latency asymptotes and beyond. 
This latter distribution is shown in Fig. 2. The mean of this dis- 
tribution of means is 1.900 sec., whereas its median is 1.084 sec. 
The skewness of the central tendencies of the organisms as a group 
is as marked as is that of the individual organism. 

It is clear from the above evidences of the degree and nature of 
the variability in our data that the obvious manner of securing an 
indication of the functional relationship of reaction latency to the 
number of reinforcements is to find the relationship of the central 
tendencies of the former to the latter. Owing to the skewness of the 
data, the two types of central tendencies are destined even for the 
very same data to differ markedly, as shown by the computations 
based on Figs. 1 and 2. This means that in some sense we shall 
emerge from our investigation with as many laws, or as many variants 
of the same law, as there are types of central tendencies employed. 


Fig. 2. The distribution of the mean reaction latencies of the 59 animals of the present study 
for the period of their latency asymptote and during their subsequent over-training. 
(Horizontal axis: reaction latency in seconds.) 

A further complication in the way of determining a lawful
functional relationship lies in the fact that some animals learn much 
faster than others, reaching their latency asymptotes while others 
are still in the improving stage. Now, in many studies the learning 
data have been pooled simply on the basis of the number of trials or 
reinforcements, regardless of the stage of learning attained by the 
individual organisms involved. On the other hand, some
investigators have equalized the stages of learning of the several Ss before 
the computation of the central tendencies. Since the two procedures 
naturally yield somewhat different results, a given type of central 
tendency may yield in some sense a different law, depending upon 
whether or not the stages of learning of the Ss are equalized before 
being pooled. In order to secure a fairly comprehensive expression 
of our data we shall treat all four major combinations. Because the 
statistical procedure is somewhat simpler, we shall treat first the
unequalized data, the central tendencies being calculated as means and 
medians respectively. 


TABLE I 


The mean and median reaction latencies ($_{S}t_{R}$) of 59 albino rats at the first 60 conditioned 
reactions on a simple bar-pressing act, together with the calculated latencies obtained by use 
of the equation, 

$_{S}t_{R} = 25.6\,N^{-1.163} + 0.3$, 

which was fitted to the adjacent column of medians. The present data were not equalized for 
differential rates of learning of the individual organisms. Entries marked [illeg.] could not be
read in the source. 


  N       Mean     Median     Median      N       Mean     Median     Median
     Empirical  Empirical Calculated         Empirical  Empirical Calculated
  1      43.90      25.00     25.900     31       4.63        .70       .772
  2      63.75      12.73     11.732     32       1.52        .70       .755
  3      81.15       8.31      7.434     33       1.54        .75       .739
  4      43.00       7.80      5.406     34       1.07        .65       .724
  5      25.18       4.45      4.239     35       1.65        .70       .710
  6      43.76       4.75      3.486     36        .81        .60       .697
  7      25.13       3.35      2.963     37       2.37        .70       .684
  8      10.17       2.95      2.580     38       1.38        .65       .672
  9      16.52       2.01      2.288     39       1.24        .74       .661
 10      15.90       2.00      2.059     40       3.64        .72       .651
 11      18.36       1.36      1.874     41        .87        .60       .641
 12      18.47       1.96      1.723     42       1.03        .65       .631
 13       8.70       1.89      1.596     43       1.26        .61       .622
 14      13.60       1.70      1.489     44        .74        .59       .614
 15      13.87       1.15      1.398     45       1.59        .59       .606
 16      13.03       1.07      1.318     46       1.41        .62       .598
 17       3.62        .95      1.249     47       2.04        .56   [illeg.]
 18       4.78       1.07      1.188     48       2.87   [illeg.]       .584
 19       3.82        .90      1.134     49       1.13        .60       .577
 20       6.94        .90      1.085     50       1.28        .56       .571
 21       2.83   [illeg.]      1.042     51   [illeg.]        .57       .564
 22       6.21        .85      1.003     52        .88        .65       .559
 23       2.52        .96       .968     53       1.06        .59       .553
 24       6.47        .80       .935     54        .77        .62       .547
 25       2.52        .80       .906     55       2.08        .57       .542
 26       2.29        .79       .879     56        .72        .55       .537
 27       1.80        .70       .854     57       1.14        .56       .532
 28       1.72        .70       .831     58       4.46        .54       .528
 29       1.43        .90       .810     59       1.25        .56       .523
 30       1.78        .81       .790     60        .85   [illeg.]       .519

These results are shown in Table I. There it may be seen that the 
means give a considerably less consistent course, e.g., between 
N = 25 and 60, than do the medians. Both diminish very rapidly 
at first as the number of reinforced conditioned evocations increases. 
However, the rate of reduction progressively decreases as N increases, 
until after 32 to 35 reinforcements the decrease is almost
imperceptible to ordinary inspection. This is shown in a graphic manner 
by Fig. 3, which represents the means presented in the second columns 
of Table I. It is to be noted that the latencies at N = 1 and 2
(represented by non-filled circles for this reason) are considerably smaller 
proportionately than are those for N = 3 and 4. This evidently 
results from a defect of the experimental technique mentioned in the 
description of the procedure. Fortunately, this apparently has
relatively little effect upon the medians, as may be noted by an inspection 
of the comparable empirical values. 

The relationship of the two central tendencies of the same
(non-Vincentized) latency data is shown in Fig. 4, which presents in 
parallel each of the two curves determined by calculating the central 
tendencies of the two sets of empirical values in Table I by successive 
tens of trials. In this figure it may be seen that both curves follow 
the same general course, but with a difference. The distance between 
them is maximal at the outset but grows less and less as the number 
of trials increases. Moreover, the rate of decrease itself grows
progressively less as N increases. 

We next take up the corresponding means and medians of the 
latencies of the same 59 Ss after the inequalities due to unequal 
learning rates were equalized. This equalization was attained 
by treating the data by what has been called a justly weighted 
Vincent technique (4, p. 245, footnote). This technique reduces 
series of any length to comparable new series of any desired number 
of items, the mean of which shall be exactly the same as the mean of 
the original series. Since the median number of conditioned reactions 
to the latency asymptote was 60, all records were reduced to the 
equivalent Vincent series of 60 values. Following this reduction of 
all 59 sets of latency records, both the means and the medians for 
each of the 60 ‘Vincent’ trials were calculated. 
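
The property claimed for the Vincent reduction, namely that a series of any length is converted to a fixed number of items with the mean unchanged, can be shown with a short sketch. The Python below is one common way of realizing such a reduction (each new item is the average of an equal fraction of the original series, with boundary items split proportionally). It is an assumption that this matches the procedure of reference (4); the function name is illustrative.

```python
def vincentize(series, k):
    """Reduce (or expand) `series` to `k` values of equal weight.

    Each output value averages one k-th of the original series; an
    original item that straddles a boundary contributes to both
    neighbours in proportion to its overlap, so the overall mean is kept.
    """
    n = len(series)
    width = n / k                      # original items per Vincent item
    reduced = []
    for j in range(k):
        lo, hi = j * width, (j + 1) * width
        total = 0.0
        i = int(lo)                    # first original item touching [lo, hi)
        while i < n and i < hi:
            overlap = min(hi, i + 1) - max(lo, i)
            if overlap > 0:
                total += series[i] * overlap
            i += 1
        reduced.append(total / width)
    return reduced


if __name__ == "__main__":
    record = [28.5, 17.6, 12.7, 7.8, 5.1, 4.2, 3.3]   # hypothetical latencies
    short = vincentize(record, 3)
    print(short, sum(short) / 3, sum(record) / 7)
```

Because every original item is counted with total weight one, the mean of the reduced series equals the mean of the original record apart from floating-point rounding.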

Fig. 3. Graphic representation of the relationship of mean non-equalized reaction latency ($_{S}t_{R}$) 
to the ordinal number of reinforced reaction evocation 

These values are shown in Table II in a manner exactly
comparable to Table I. There it may be seen that the means and the 
medians of Table II are of about the same magnitude as at
comparable N’s in Table I. Also as before, the means are considerably 
larger than are the medians, especially at small N’s, but this difference 
diminishes to very small proportions at large values of N. There is 
also the difference, as already pointed out in connection with Table I, 
that the means show a considerably larger number of erratic long 
deviations from the majority of the other means than occur among 
the medians, especially after N = 30. A graphic representation of 
the reduction of median reaction latencies as N increases is shown in 
Fig. 5, the various latencies being represented by the black circles 
which, incidentally, are not individually connected by separate lines 
as are those of Fig. 3. 

Fig. 4. Graphs of the means and medians respectively of the empirical data of Table I, 
averaged by tens and arranged to demonstrate the quantitative differences produced by the 
two methods of computing the central tendencies of identical data 

TABLE II 


The mean and median reaction latencies ($_{S}t_{R}$) of 59 albino rats at the first 60 conditioned 
reactions on a simple bar-pressing act after differential rates of learning had been equalized by 
the equally weighted Vincent technique (4, p. 245). The fourth column contains the
corresponding median values as calculated from the equation, 


$_{S}t_{R} = 33.0\,N^{-1.2} + .25$, 


which originally was fitted to the adjacent column of empirical data. Entries marked [illeg.]
could not be read in the source. 


  N       Mean     Median     Median      N       Mean     Median     Median
     Empirical  Empirical Calculated         Empirical  Empirical Calculated
  1      49.96      28.50     33.250     31       1.60   [illeg.]   [illeg.]
  2      58.11      17.60     14.614     32       1.72        .70       .766
  3      70.40      12.73      9.080     33       1.53        .69       .747
  4      44.26       7.80      6.502     34       1.39        .71       .729
  5      40.41       5.10      5.034     35       1.77        .75       .713
  6      24.64       4.15      4.094     36       2.94        .62       .698
  7      22.90       3.30      3.444     37       4.63        .66       .683
  8      19.85       3.00      2.971     38       1.73        .73       .670
  9      33.46       3.27      2.613     39       1.27        .82       .657
 10      23.95       1.93      2.332     40       1.11        .61       .644
 11      13.49       1.63      2.107     41       1.06        .68       .633
 12      15.51       1.52      1.923     42        .68        .61       .622
 13       7.62       2.00      1.770     43        .72        .56       .612
 14      12.64       1.90      1.640     44       1.08        .58       .602
 15      17.99       1.67      1.530     45       1.38        .63       .593
 16      32.31       1.80      1.435     46        .98        .62       .584
 17      15.01       1.62      1.351     47       2.38        .61       .575
 18      12.96       1.15      1.278     48       3.64        .56       .567
 19       7.28       1.05      1.214     49        .66        .54       .559
 20       9.16       1.21      1.156     50       9.79        .49       .552
 21       6.93       1.04      1.105     51       2.81        .47       .545
 22       3.48        .97      1.058     52        .63        .51       .538
 23       3.53   [illeg.]      1.016     53        .78        .50       .531
 24       4.09   [illeg.]       .978     54       1.23   [illeg.]       .525
 25       2.01        .92       .943     55        .69   [illeg.]       .519
 26       3.28       1.02       .912     56        .71        .52       .513
 27       2.78        .86       .882     57        .72        .51       .508
 28       1.96       1.12       .855     58        .65        .51       .503
 29       1.78        .97   [illeg.]     59        .75        .47       .497
 30       2.88        .82       .807     60       1.45        .51       .493


Having before us the four types of central tendency of the reaction 
latencies, we are now in position to attempt the determination of the 
molar law which states in relatively simple equational form a first 
approximation to the quantitative relationship of $_{S}t_{R}$ to N. We have 
felt that the explorational nature of the present investigation does not 
warrant the great labor involved in the complete least squares fitting 
techniques. We have accordingly fitted equations to the present 
empirical data roughly as follows: 


1. A dot distribution of the ($_{S}t_{R}$) data whose law is sought was 
plotted on ordinary graph paper, much as that shown in Fig. 5, and a 
smooth curve was drawn by free-hand through the central tendencies 
of the more or less scattered data to secure a rough notion of the 
general shape of the function involved. 

Fig. 5. Spatial representation of the relationship of median reaction latency equalized for 
unequal learning rate, which is represented by the black circles, to number of reinforced trials. 
The smooth curve running among the circles has been plotted from the calculated values given 
in the fourth columns of Table II. 


2. From a general knowledge of the graphic forms yielded by 
various common types of equations, the two or so most likely to yield 
curves close to the free-hand one drawn through the data were selected. 

3. After this selection the data were plotted on either semi-log 
or log-log graph paper, according to the equation choice regarded as 
probably most suitable. 

4. When a method of plotting was found which gave on the ap- 
propriate type of graph paper the closest approximation to a straight 
line, measurements were made of the position of this line on the 
paper, the characteristic constants of the corresponding equation 
were calculated from these measurements, and the equation was 
tentatively written. 

5. The values of the independent variable (N) were then
substituted successively in the tentative equation and the calculated 
values were compared with the corresponding empirical values, the 
deviations were found and squared, and the squares were averaged. 

6. In general, various slightly varying sets of constants for any 
given type of equation were tried out in this manner, that one being 
finally tentatively selected which yielded the least mean square 
deviation from the empirical values. In case the early trials were 
not very good fits, one or more other types of equation were tried out, 
that equation with that combination of constants which yielded the 
smallest mean square deviation of all being finally selected. 
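
The six steps above can be sketched in code. The following Python is a minimal illustration and not the authors' computation: it uses only the first ten Vincentized medians of Table II, it replaces the reading of a line by eye from log-log paper with an ordinary straight-line fit to the logged values, and all function names and the grid of trial constants are hypothetical.

```python
import math

# Median latencies for N = 1..10, taken from the Vincentized medians of Table II.
N = list(range(1, 11))
MEDIANS = [28.50, 17.60, 12.73, 7.80, 5.10, 4.15, 3.30, 3.00, 3.27, 1.93]


def predict(n, a, b, c):
    """Decreasing power function: latency = a * n**(-b) + c."""
    return a * n ** (-b) + c


def mean_square_deviation(a, b, c):
    """Step 5: average squared deviation of calculated from empirical values."""
    return sum((predict(n, a, b, c) - y) ** 2 for n, y in zip(N, MEDIANS)) / len(N)


def loglog_line(c):
    """Steps 3 and 4: straight line through (log N, log(latency - c)).

    Returns the constants (a, b) implied by the intercept and slope.
    A least-squares line stands in here for a line measured on graph paper.
    """
    xs = [math.log10(n) for n in N]
    ys = [math.log10(y - c) for y in MEDIANS]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return 10 ** (my - slope * mx), -slope


def grid_search():
    """Step 6: try slightly varying constants, keep the least mean square deviation."""
    candidates = [(30.0 + i, j / 10, c)
                  for i in range(7)            # a = 30.0 .. 36.0
                  for j in range(10, 15)       # b = 1.0 .. 1.4
                  for c in (0.0, 0.25, 0.5)]
    return min(candidates, key=lambda abc: mean_square_deviation(*abc))


if __name__ == "__main__":
    print("log-log line:", loglog_line(0.25))
    best = grid_search()
    print("best on grid:", best, mean_square_deviation(*best))
```

Since only ten of the sixty points are used, the constants selected by this sketch need not coincide with those the authors report for the full series.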


As a concrete example of the outcome of this general procedure, 
an equation so found is: 


str = 33.0 N7!-? + .25. (1) 


This was fitted to the values of columns 1 and 3 of Table II. The
calculated values of stR secured by substituting the various values of
N one at a time in this equation and solving are given in column
4 of Table II. From these fitted values of stR was plotted the smooth
curve which runs freely among the circles of Fig. 5. A mere glance
shows that the fit obtained is a reasonably good first approximation.
As a matter of fact this is the best fit achieved by us on any of the
four sets of data. However, the fit attained in the same manner to
the medians of the non-Vincentized latencies of Table I is practically
as good, the equation in this case being


${}_{s}t_{R} = 25.6\,N^{-1.163} + 0.3.$ (2)


The calculated values secured by the use of this equation are given 
in the fourth columns of Table I. 

Considerably greater difficulty was encountered in getting a
reasonably good fit with the mean latencies, particularly either in the
region from N = 8 to 18, or in that from N = 1 to 10. This doubtless
was in part due to the distortion of the first two or more of both
of these sets of latencies, as already mentioned. The equation fitted
to the non-equalized mean latencies (completely disregarding the
first two) which showed minimal deviations from the empirical values
was


${}_{s}t_{R} = 615.63\,N^{-1.69} + 0.35.$ (3)
That secured for the Vincentized mean latencies was 
${}_{s}t_{R} = 574.5\,N^{-?} + 0.35.$ (4)


The curves obtained from both equations (3) and (4) fall slightly
below the central tendencies of the major groupings of the empirical
stR data in the region of N = 8 to 18. They agree with equations (1)
and (2) in being of the same type, namely, decreasing power functions,
but with appreciably differing constants.
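To see how much the constants matter, the two equations whose constants are fully legible in the text, (2) and (3), can be evaluated side by side. This is a minimal sketch; the function and constant names are hypothetical. Equation (3) was fitted while disregarding the first two latencies, so its values at N = 1 and 2 are extrapolations of the fitted curve rather than descriptions of data.

```python
def latency(n, a, b, c):
    """Reaction latency in seconds from a power function a * n**(-b) + c."""
    return a * n ** (-b) + c

# Constants as printed in equations (2) and (3) of the text.
MEDIAN_NON_VINCENTIZED = (25.6, 1.163, 0.3)
MEAN_NON_EQUALIZED = (615.63, 1.69, 0.35)

print(" N   median eq. (2)   mean eq. (3)")
for n in (1, 2, 4, 8, 16, 32, 64):
    median_value = latency(n, *MEDIAN_NON_VINCENTIZED)
    mean_value = latency(n, *MEAN_NON_EQUALIZED)
    print(f"{n:2d}   {median_value:14.2f}   {mean_value:12.2f}")
```

At N = 1 each curve equals a + c, which is why the coefficient a behaves as the maximal latency, and for large N each curve settles toward its constant c.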


SUMMARY 


Fifty-nine male albino rats were trained by distributed trials to
make a simple bar-pressing movement until each organism was well
beyond its reaction-latency asymptote, and the resulting latencies
were measured. The object of the investigation was to determine
the molar law according to which reaction latency varies with the
number of reinforcements uniformly following conditioned-reaction
evocation. It was found that:


1. When uncomplicated by practice effects the typical distribution
of the reaction latency of the individual organism from moment to
moment is markedly skewed, with the mean considerably greater than
the median.

2. The variability of the reaction latency for a group of organisms, 
when practically uncomplicated by either practice effects or individual 
behavioral oscillation, was likewise markedly skewed, the mean in 
this case also being considerably larger than the median. 

3. In general, the central tendency of the reaction latency, when
under the influence of learning, is larger when measured by the mean
than when measured by the median, the difference between the two
being much greater in the early stages of training. The equalization
of the individual rates of learning makes surprisingly little superficial
difference in either measure of central tendency.

4. By all four methods of calculating the central tendency of the 
reaction latencies of the group, it decreases as reinforced reaction 
evocations increase, the rate of decrease being extremely rapid at 
first and extremely slow later as the latency approaches its asymptote. 

5. The molar law of stR as a function of N has in a sense four
forms according to the measure of central tendency employed and
depending on whether the S’s rate of learning is equalized.

6. The attempt to formulate the specific relationship of reaction
latency as a function of the number of reinforced reaction evocations
in all four cases leads to the view that the molar law in question is a
power function of the general form,


${}_{s}t_{R} = a\,N^{-b} + c,$


where a, b, and c are constants. Of these constants,

7. The coefficient a, which represents the maximal conditioned 
reaction latency, falling at N = 1, is considerably larger for means 
than for medians, as might easily have been anticipated, ranging from 
several hundred sec. for means down to 25 or 30 sec. for medians. 

8. The exponent b, which represents the rate of decrease of stR
as N increases, varies in the region of 1.1 to 1.7, being larger for means
than for medians.

9. The constant c, presumably representing the physiological
limit of latency, varies from .25 to .35, the means again being slightly
larger.

10. In this particular study, possibly because of the defect in the 
first two or so experimental data, the medians are believed to yield 
the more significant measures. 

11. Of the medians, the one based on the Vincentized latency 
values is believed to be distinctly the more significant of the two 
because distortions due to learning inequalities are largely eliminated. 
Accordingly the best approximation of the quantitative molar law
relating stR to N emerging from this pioneering investigation is
believed to be:

${}_{s}t_{R} = 33.0\,N^{-1.?} + .25.$


(Manuscript received July 10, 1946) 
THE DIFFERENCE BETWEEN MONAURAL
AND BINAURAL THRESHOLDS 


BY W. A. SHAW 


University of Pennsylvania


AND E. B. NEWMAN AND I. J. HIRSH 


Harvard University 


When both ears of a normal observer are stimulated with a tone
of supraliminal intensity, the loudness which is perceived is greater
than the loudness when either ear is stimulated alone. The nervous
mechanism appears to be such that there is an addition or mutual
reinforcement of the effects of stimulation in each ear. These facts
seem to be well established and are generally accepted.

Less general agreement obtains about what happens when the 
stimulation is of liminal or threshold intensity. Can sounds which 
are actually slightly subliminal in both ears reinforce each other so that 
together they produce a supraliminal binaural effect? More specific- 
ally, is the binaural threshold significantly lower than the monaural 
threshold? 

Sivian and White (6) have taken the view that there is no summation.
They reported that the binaural threshold was equal to the
threshold of the more sensitive of the two ears alone. Several other
observers have presented evidence to show that the one ear does aid
the other even at the threshold of hearing. These writers have disagreed
about the amount of difference between monaural and binaural
thresholds, but they have agreed that the binaural threshold is
generally lower.

It is the purpose of the present report to review critically the
experiments concerned with the difference between monaural and
binaural thresholds, and to report the results of three new sets of
measurements made by different methods.¹ We believe that it can
be shown that the previous results are in considerable measure a
function of the methods used, and that suitable methods show a
substantial additive effect of the two ears at threshold levels.


1 This work was begun at the Psycho-Acoustic Laboratory under a contract between Harvard 
University and the Office of Scientific Research and Development, and has been completed under 
Contract Nsori-76 between Harvard University and the U. S. Navy, Office of Research and 
Inventions, Report No. PNR-22. 
HISTORICAL


The view that there is no summation of loudness at the threshold
was clearly expressed by Sivian and White (6) in 1933. These Es
made a limited number of monaural and binaural determinations of
the minimum audible field.

The O was suspended in one corner of a ‘highly absorbent’ sound
stage located within a sound-proofed room. The sound came from a
loudspeaker one meter in front of him. The procedure was the one
commonly used for audiometric tests. The O signalled the E by
means of a key as long as he could hear the tone. At the beginning
of each test, the tone was continuous, and it was reduced in five db.
steps. As the threshold was approached, the E began to interrupt
the tone and to reduce the intensity of the tone in smaller steps until
near threshold they were usually one db. “The operator judged from
the ability of the observer to follow the interruptions with his key
whether or not he heard the tone at each intensity” (6, p. 293).

Monaural thresholds were determined for a total of 14 ears: both 
the right and left ears of five men and the right ears of four women. 
Binaural data appear to have been taken for a total of 12 Ss. Of the 
entire group, directly comparable monaural and binaural data are 
available from only two Ss. 

“For two members of the group data were available binaurally 
as well as on each ear separately. These data show no significant 
difference between the binaural M.A.F. and the best ear M.A.F. 
Accordingly, for the three others in the group the best ear M.A.F. 
was taken to be the binaural M.A.F.” (6, p. 295). Since, unfortu- 
nately, the data on which this conclusion is based are not given, this 
statement must be taken at its face value. Nor can any conclusion 
be drawn from a comparison of the plotted values for the monaural 
M.A.F. and the binaural M.A.F. The monaural values involve an 
average of both ‘better’ and ‘worse’ ears, while the binaural data 
involve an averaging of ‘best’ ear data for some observers and direct 
binaural determinations for others. 

Aside from this lack of information, at least three things tend
to obscure any differences which might have occurred. First, it is
difficult to determine from the description of the procedure whether a
proper method of limits was used, or whether the E judged, simply by
watching the O’s signal, what value of attenuation to record as the
threshold value. Secondly, the case rests upon the results from only
two Os in an unspecified number of trials. Lastly, the monaural
threshold for one ‘group’ cannot be compared with the binaural
threshold for another ‘group’ because the total number of Os is not
large enough to make the group averages sufficiently reliable.
Fletcher and Munson (2) discuss summation at the threshold
incidental to their use of the difference between monaural and
binaural listening in constructing the function relating loudness to
intensity. Their conclusions with respect to tones of threshold intensity
are somewhat ambiguous. In certain statements they seem
to favor the Sivian and White assumptions. On the other hand,
their basic formula strongly implies that summation takes place at
liminal as well as supraliminal intensities. It is evident that the unit
of loudness is the loudness of a tone at threshold. Fractional values
of this unit might perfectly well exist in terms of a very low level of
stimulation in the ear or auditory nerve. The only assumption this
involves is that the threshold is ultimately a cortical phenomenon and
that therefore low levels of peripheral activity may be added together
before the threshold mechanism is reached. It is interesting to note
that Fletcher and Munson actually give fractional values of loudness
less than one in their table.

The lowest level at which experimental determinations of the 
loudness balance between the monaural and binaural conditions were 
carried out was approximately 13 db. above threshold. If this 
function, based upon loudness balances, is extrapolated down to 
threshold, one could estimate that there is a difference of about three 
db. between the monaural and binaural thresholds. 

These observations are unfortunately complicated by the fact 
that Fletcher and Munson’s monaural observations involve use of 
the ‘average’ ear. The ‘better’ ear will average two to three db. 
lower in its threshold than the average ear simply because the two 
ears will, on the average, differ by approximately twice that amount. 
It is possible, therefore, that the difference between monaural and 
binaural thresholds can be explained entirely in terms of this differ- 
ence between the average ear and the best ear. 

In conclusion, the writers state that “when listening with both 
ears, the threshold is determined principally by the better ear. 
However, some experimental tests which we made on one-ear acuity 
vs. two-ear acuity showed the latter to be slightly greater than for the 
better ear alone, but the small magnitude involved and the difficulty 
of avoiding psychological effects caused a probable error of the same 
order of magnitude as the quality [sic ] being measured” (2, p. 106). 

More direct evidence, however, is not altogether lacking for sub- 
liminal reinforcement. Gage (3) has reported a series of measure- 
ments for 22 different frequencies in which the binaural threshold is, 
on the average, just over 1.0 db. lower than the best monaural 
threshold. It is of course much lower, by 5.6 db., than the less 
sensitive ear. 
Gage’s procedure is described as follows: A small dynamic loudspeaker
was located in a small sound-proofed box opposite a rigid
tube leading to an acoustic filter. From the filter the tube bifurcated
and each of the channels was terminated in an earpiece similar to an
earphone cushion. Part of each channel consisted of rubber tubing
which could be closed by the use of a clip. The E, working in a
sound-proofed room, controlled the apparatus. Six observations
were made at each frequency for each condition, i.e., right and left
ear monaural thresholds and the binaural threshold.

In Table I the differences between the more sensitive monaural 
and the binaural, and the less sensitive monaural and the binaural 
thresholds are presented. 


TABLE I


DIFFERENCES BETWEEN MORE SENSITIVE MONAURAL AND THE BINAURAL, AND THE LESS
SENSITIVE MONAURAL AND THE BINAURAL THRESHOLDS, AS REPORTED BY GAGE (3)


            More Sensitive   Less Sensitive               More Sensitive   Less Sensitive
Frequency   Minus Binaural   Minus Binaural   Frequency   Minus Binaural   Minus Binaural
                in db.           in db.                       in db.           in db.
   100            0.4              3.3            800           2.9              5.8
   120            0.0              2.9           1000          -1.7              4.6
   140           -1.7              0.8           1200           0.8              2.2
   160            2.9              5.8           1400           3.3              7.5
   180            2.9              7.5           1600          -1.7              6.7
   200            5.1             10.8           2000           5.0
   250           -0.8              3.8           2400          -1.2              2.9
   300            0.8              2.1           3000           0.0              6.2
   400            3.3              8.8           4000           0.8              3.8
   500            5.8             12.8           6000           0.0              8.3
   600            0.4              2.9           8000           1.7              4.1


The author concludes that the term ‘binaural’ can have more than 
one meaning. Clearly the monaural threshold is higher than the 
binaural but the magnitude of the difference between them appears 
to depend as much, or perhaps more, upon what definition of monaural 
threshold is adopted than upon what happens to the binaural thresh- 
old. Is the monaural threshold, for example, that of (1) the more 
sensitive ear, (2) the less sensitive ear, or (3) the average of the more 
and less sensitive ears? Depending on which of these is chosen, the 
difference turns out to be 1.0 db. between the binaural and more 
sensitive ear, 5.6 db. for the less sensitive ear, and 3.3 db. for the 
‘average’ ear. 

These results agree quite well with results obtained with more 
careful techniques. It is possible, of course, that stopping off one 
of two interconnected tubes carrying sound may change the level 
in the other. It is probable that the monaural thresholds at some 
frequencies may be a bit too high on this account. 
More convincing evidence is provided in the study of subliminal
summation by Hughes (5), who determined thresholds for each ear
separately, using tones produced by independent oscillators, amplifiers
and earphones. The threshold was then redetermined with a
variety of subliminal stimuli in the contralateral ear. In the simplest
case, for example, the contralateral tone was of the same frequency
and was set at 3, 4 or 5 db. below the threshold for that ear. It was
found that the threshold for the original ear was then lower by 3.1,
2.2, and 1.8 db., respectively. These results may be readily generalized
by suggesting that the nervous excitation at threshold is approximately
proportional to the energy of the stimulating tone, and
that the threshold in one ear is lowered by an amount directly related
to the energy delivered to the opposite ear. It is evident that energy
must be expressed in each case relative to the energy required for the
monaural threshold.
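The generalization can be checked numerically. The sketch below assumes simple energy additivity: each ear's stimulus is expressed as a fraction of the energy needed for that ear's own monaural threshold, and the binaural threshold is taken to be reached when the two fractions sum to one. The function name is hypothetical, and the rule is the suggestion made in the text, not an established law.

```python
import math

def predicted_threshold_shift_db(contralateral_db_below_threshold):
    """Lowering (in db.) of the test ear's threshold under assumed energy additivity."""
    # Energy in the contralateral ear as a fraction of its own threshold energy.
    fraction_other_ear = 10 ** (-contralateral_db_below_threshold / 10)
    # The test ear must supply whatever fraction remains to reach a total of one.
    fraction_needed = 1.0 - fraction_other_ear
    return -10 * math.log10(fraction_needed)

# Shifts reported in the text for contralateral tones 3, 4 and 5 db. below threshold.
REPORTED = {3: 3.1, 4: 2.2, 5: 1.8}

for level, observed in REPORTED.items():
    predicted = predicted_threshold_shift_db(level)
    print(f"{level} db. below: predicted {predicted:.2f} db., reported {observed} db.")
```

Under this assumption the predicted shifts are about 3.0, 2.2 and 1.7 db., close to the reported 3.1, 2.2 and 1.8 db. The same rule gives a lowering of about three db. when the two ears receive equally effective stimuli, since each then need supply only half the threshold energy.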

A considerable portion of Hughes’ work is concerned with the
summation produced when the frequencies used in the two ears are
not the same. Thus, the presence of a subliminal stimulus of 2000
cycles in one ear lowers the threshold for 1000 cycles (and for other
frequencies as well) in the opposite ear. Two different frequencies
appear to summate just as well as does one. This fact, if true, has
rather profound implications as to the nature of the threshold
mechanism, and deserves to be carefully checked.

In conclusion it may be said that certain facts seem to stand out 
in the survey of previous work. It will be well to restate them here. 


(1) The two ears of a typical O commonly differ in their sensitivity
at any given frequency. They may differ by an amount anywhere
between zero and as much as 10 db. in either direction. The
average difference for normal Ss is about five db.

(2) When large differences in sensitivity exist between the ears, 
or when the stimulus in one ear is substantially below the threshold 
for that ear alone, the more sensitive ear determines the binaural 
threshold with little or no assistance from the less effectively stimu- 
lated ear. 

(3) When stimuli are delivered to the two ears which are equally 
effective, as measured by the threshold of each alone, the binaural 
threshold is substantially lower than the monaural thresholds. 
Estimates of the exact amount vary from one to three db. 

(4) If the energy delivered to each ear in determining a binaural
threshold is expressed as a fraction of the energy required to reach
the monaural threshold in either ear, the sum of the fractions will be
found to equal one. There is no a priori reason, however, why this
relation should hold. The binaural summation must be physiological
in character and it would be largely fortuitous if the effective neural
excitation varies directly with the energy of the stimulus in the
vicinity of the threshold.


EXPERIMENT I 
Procedure: 


In the course of work at the Psycho-Acoustic Laboratory directed to a number of practical 
ends, three experiments have been performed which afford both significant corroboration of the 
results of previous investigators and a demonstration of the fact that the principal differences 
in results stem from differences in method. 

The first experiment was performed by first determining the threshold for each ear independently
and comparing this with the threshold when equal intensities of sound are introduced
into the two ears at the same time. This procedure measures the ‘normal’ amount of summation
experienced by a listener using a balanced headset or located in a symmetrical sound field. The
summation will be less than the ‘maximum’ summation whenever there is a difference in the
sensitivity of the listener’s two ears.

One important experimental control must be carried out. Even selected earphones differ 
from one another in sensitivity at particular frequencies. The extent of this difference is not 
readily determined and the best result will be obtained if the two earphones used are systematically 
exchanged between the two ears. As a result six thresholds, rather than three, are required in 
the complete experimental design. The six test conditions are the following: 


Right monaural threshold: Earphone A (Serial No. 13). 
Right monaural threshold: Earphone B (Serial No. 6). 
Left monaural threshold: Earphone A. 

Left monaural threshold: Earphone B. 

Binaural threshold: Earphone A on right, B on left ear. 
Binaural threshold: Earphone B on right, A on left ear. 


The actual order in which these six tests were carried out by any one S was determined by drawing 
numbers at random from a box. 


In this experiment two Permoflux PDR-8 dynamic earphones served as the source of sound. 
They were connected as shown in Fig. 1. The test frequencies were obtained from a General 
Radio 913-A beat-frequency oscillator. The tone was passed through a cam-operated electronic 
switch which interrupted it irregularly without producing any clicks. 


[Block diagram: oscillator, electronic switch, attenuator, matching transformer, headset; Ballantine V.T. voltmeter]


Fig. 1. Block diagram of equipment for Experiment I


The tone then passed through an attenuator and matching transformer to the earphones. 
The O was able by varying the attenuator setting to adjust the intensity of the test tones in 
steps of two db. Whenever two earphones were used, they were connected in parallel across 
the five-ohm tap of the transformer. The earphones were mounted in the Army-type HB-7 
headband, which was adjusted to apply approximately 1000 gm. pressure. They were equipped 
with the Army’s MX-41/AR soft neoprene cushions. This mounting of the earphones has been 
shown to produce quite uniform and excellent coupling to the ear. 
The 10 Os used in this experiment had essentially normal hearing in both ears when tested
by the use of an audiometer. They had been thoroughly trained with daily practice in making
threshold observations previous to this test. All the tests were conducted in a quiet, ‘anechoic’
(echo-free) room.

The initial level of each test tone was adjusted until it could be clearly heard. In accordance 
with a procedure previously standardized for similar tests, the O was instructed to attenuate 
the signal until he reached the region of his threshold. He was further instructed to ‘search,’ 
i.e., to increase the attenuation until he was certain that no sound could be heard, and then to 
decrease the attenuation until the sound could again be heard. At this point the O continued 
to adjust the attenuation until he was fully satisfied that he had reached the threshold. No time 
limit was imposed upon him. 

Not all of the Os had precisely the same criterion of threshold. They had all reported in 
previous tests that near threshold it was hard to follow the ‘interrupted’ characteristic of the 
tone, produced objectively by the electronic switch, and it consequently appeared to become 
steady. All Os attenuated the test tone beyond this point. On the other hand, Os also report 
that when they have reduced the signal beyond the point where they can clearly follow its in- 
terrupted character, the signal is still heard in a highly irregular manner. At one moment the 
test tone seems to disappear completely; a moment or two later it may be heard again as a ‘burst’ 
of tone of startling clarity. Many of the Os used this as their criterion of threshold. Others, 
who have difficulty in recalling the ‘sound’ of the tone when working near the threshold, can 
rarely go much beyond the point where they follow the objectively produced interruptions. 
This difficulty in recalling the tone when working near the threshold is frequently reported by Os 
when ‘head noises’ are present. 

Fourteen frequencies were used in the tests. The initial frequency was chosen at random 
and the other frequencies were tested in order up or down the scale. Thresholds at all 14 fre- 
quencies were determined at one time for one of the test conditions mentioned above. As a 
rule, 28 thresholds (14 frequencies for two test conditions) were determined at one sitting. An 
interval of two to four days occurred between sittings. All Os determined a single threshold 
under each test condition. They did their own attenuating and recorded their own results. 


Results: 


Before comparing the results of the binaural and monaural thresholds,
it will be of interest to examine briefly the monaural results
alone. In particular, it will be valuable to see how large is the
difference between the better ear and the average ear at each of the
test frequencies. It will be recalled that the entire difference between
monaural and binaural listening has sometimes been ascribed to this
factor.

Since each ear was tested with two earphones, it was necessary
first to average the results of test conditions 1 and 2, and of conditions
3 and 4, for all 10 Os at all frequencies tested. If now the larger of
the two attenuator readings for each observer is chosen, an average
of these values will represent the monaural threshold for the ‘better’
ear. Similarly, if the results for both ears are combined, a value for
the ‘average’ ear is obtained. These values are presented in Table II
for the 14 test frequencies, together with the difference between the
better ear and the average ear.

The differences found in this table between the better ear and the 
average ear are closely comparable with the values reported by 
Fletcher and Munson. It can be safely argued that the present 
results effectively duplicate those obtained in the earlier experiment. 


TABLE II


COMPARISON OF MONAURAL THRESHOLDS FOR THE BETTER EAR AND AVERAGE EAR
OF 10 Os IN TERMS OF DECIBELS ATTENUATION BELOW AN ARBITRARY VOLTAGE


Frequency   Better Ear   Average Ear   Difference
     100       24.0          22.2          1.8
     250       51.7          50.2          1.5
     500       63.2          61.9          1.3
     750       67.7          66.3          1.4
   1,000       70.3          68.4          1.9
   1,500       69.8          67.7          2.1
   2,000       70.3          68.5          1.8
   2,500       72.7          69.6          3.1
   3,000       74.2          71.8          2.4
   4,000       69.0          65.8          3.2
   5,000       65.5          63.0          2.5
   6,000       65.4          62.6          2.8
   8,000       51.2          47.7          3.5
  10,000       33.8          28.2          5.6


On the other hand, there is one quite serious limitation of these
values, a limitation which we believe is applicable to all results so far
reported. Stated simply, the greater sensitivity of the better ear
will necessarily be overestimated by this procedure since some of the
deviation of the better ear from the mean will be a product of random
rather than systematic factors, and these random factors will not
operate equally in the case of the average and of the better ear. For
example, let us suppose that the ears of certain of the Ss had exactly
the same sensitivity. If now there is some lack of reliability in the
measurements of these two ears, one measurement will by chance
exceed the other. This ear arbitrarily becomes the ‘better’ ear, and
its random variation from the mean is ascribed to the systematic
difference between the ears.

It may be concluded then that the differences reported in Table 
II are too large. Unfortunately it is not possible from data at hand 
to estimate just how large is the degree of over-estimation. 
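The selection effect described above can be illustrated by a small simulation. It is a sketch under loudly stated assumptions: every simulated listener has two ears of exactly equal true sensitivity, the measurement error is normal with a standard deviation of 2 db. (a purely hypothetical figure, since the text gives no reliability data), and the function name is invented for the illustration.

```python
import math
import random
import statistics

def better_minus_average(n_listeners, error_sd, seed=0):
    """Mean apparent advantage of the 'better' ear when the true ears are identical."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_listeners):
        # Both ears have the same true threshold (0 db.); only the error differs.
        left = rng.gauss(0.0, error_sd)
        right = rng.gauss(0.0, error_sd)
        better = max(left, right)      # larger attenuation reading is called 'better'
        average = (left + right) / 2
        gaps.append(better - average)
    return statistics.mean(gaps)

ERROR_SD = 2.0  # hypothetical measurement error in db., not a value from the study
gap = better_minus_average(100_000, ERROR_SD)
expected = ERROR_SD / math.sqrt(math.pi)  # known mean of |x - y| / 2 for two normal errors
print(f"simulated spurious advantage: {gap:.2f} db. (theory: {expected:.2f} db.)")
```

Even with identical ears the 'better' ear appears more sensitive than the average by roughly 0.56 times the standard deviation of the measurement error, which is the kind of overestimation the text argues is built into Table II.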

Of considerably greater interest is the comparison of the monaural 
and binaural thresholds. The binaural results are obtained by 
averaging the thresholds for each O under test conditions 5 and 6. 
Table III presents the average binaural thresholds for all 10 Os 
together with the comparable thresholds for both the better and 
average single ear. 

An examination of this table makes it apparent that there is a 
substantial and consistent difference between the average monaural 
threshold and the binaural threshold, amounting to from two to six 
db. If we assume that the monaural thresholds were determined 
by a truly random sample of ears, we find that the ‘normal’ binaural 
threshold will be about four db. lower than the monaural threshold. 
TABLE III


COMPARISON OF BINAURAL THRESHOLDS WITH MONAURAL THRESHOLDS FOR AVERAGE OF ALL
EARS AND FOR BETTER EAR OF EACH O IN TERMS OF DECIBELS
ATTENUATION BELOW AN ARBITRARY VOLTAGE


Frequency   Binaural   Average   Difference   Better   Difference
     100      26.0       22.2       3.8        24.0       2.0
     250      55.5       50.2       5.3        51.7       3.8
     500      66.1       61.9       4.2        63.2       2.9
     750      70.1       66.3       3.8        67.7       2.4
   1,000      71.6       68.4       3.2        70.3       1.3
   1,500      70.9       67.7       3.2        69.8       1.1
   2,000      72.2       68.5       3.7        70.3       1.9
   2,500      72.5       69.6       2.9        72.7      -0.2
   3,000      73.6       71.8       1.8        74.2      -0.6
   4,000      69.5       65.8       3.7        69.0       0.5
   5,000      66.9       63.0       3.9        65.5       1.4
   6,000      67.5       62.6       4.9        65.4       2.1
   8,000      53.2       47.7       5.5        51.2       2.0
  10,000      34.5       28.2       6.3        33.8       0.7


More important, perhaps, from the theoretical point of view is the 
fact that the binaural thresholds are also quite consistently lower 
than even the better ear. It should again be mentioned that the 
sensitivity of the better ear is overestimated by this procedure. In 
spite of this fact, there is still a difference of between one and two db. 
between the better ear and the ‘normal’ binaural threshold. In the 
face of these results it can hardly be maintained that the binaural 
threshold is simply the threshold of the better ear. 

It will assist, perhaps, in clarifying the relationships expressed in
these tables if all of the results are expressed in common terms. It
will be simplest to transform all the results into terms of sensation
level, where the S.L. for any given frequency is the ratio in db. of any
given sound pressure level to the sound pressure level necessary to reach
the average monaural threshold. In these terms, then, an averaging
of all the results for the 14 frequencies tested results in the following
comparisons:

Threshold of ‘worse’ ear         2.5 db. S.L.
Threshold of average ear         0.0 db. S.L.
Threshold of ‘better’ ear       -2.5 db. S.L.
‘Normal’ binaural threshold     -4.0 db. S.L.
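These four figures can be recovered from the columns of Tables II and III. The sketch below assumes the tabled values as read here, including a few digits restored from damaged print (for example 47.7 at 8,000 cycles and 65.5 for the better ear at 5,000 cycles), and assumes that the 'worse' ear lies as far above the average ear as the 'better' ear lies below it. Variable and function names are hypothetical.

```python
from statistics import mean

# Thresholds in db. of attenuation at the 14 test frequencies (larger = more sensitive).
AVERAGE_EAR = [22.2, 50.2, 61.9, 66.3, 68.4, 67.7, 68.5,
               69.6, 71.8, 65.8, 63.0, 62.6, 47.7, 28.2]
BETTER_EAR = [24.0, 51.7, 63.2, 67.7, 70.3, 69.8, 70.3,
              72.7, 74.2, 69.0, 65.5, 65.4, 51.2, 33.8]
BINAURAL = [26.0, 55.5, 66.1, 70.1, 71.6, 70.9, 72.2,
            72.5, 73.6, 69.5, 66.9, 67.5, 53.2, 34.5]

def sensation_level(thresholds):
    """Mean threshold in db. relative to the average ear (negative = more sensitive)."""
    return mean(avg - value for avg, value in zip(AVERAGE_EAR, thresholds))

better_sl = sensation_level(BETTER_EAR)
binaural_sl = sensation_level(BINAURAL)
worse_sl = -better_sl  # assumed mirror image of the better ear about the average

print(f"worse ear    {worse_sl:+.1f} db. S.L.")
print(f"average ear  {sensation_level(AVERAGE_EAR):+.1f} db. S.L.")
print(f"better ear   {better_sl:+.1f} db. S.L.")
print(f"binaural     {binaural_sl:+.1f} db. S.L.")
```

More attenuation means a weaker tone at threshold, so a larger tabled value corresponds to a lower (more negative) sensation level.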


EXPERIMENT II 


The obvious limitation in these results proceeds from the fact
that the two ears are rarely if ever equated even in otherwise quite
normal Os. On the other hand, it seems reasonable to suppose that
the maximum summation would take place if the stimuli presented
to the two ears are functionally equal. In this case the binaural
threshold should be even further below either of the monaural
thresholds.


Procedure: 


A quite different procedure was adopted, therefore, in the second experiment. Essentially, 
three different attenuators were provided, one for each of the two ears separately and a third 
which controlled the signal at a common point before it was divided between the two channels. 
The O was required, at one session, to find the monaural threshold first for one ear, then the 
monaural threshold for the other ear, and finally, with an amount of attenuation set into the 
path to each ear proportional to its sensitivity, to determine a binaural threshold by varying 
the signal to the two ears together. Thus, with the two ears equated in the preliminary tests, 
there is no longer a better ear and a worse ear but, within the limits of experimental error, two 
equal ears contributing equally to the binaural effect. 


[Block diagram: attenuators, transformer, headset (earphones A and B), observer’s control circuit switch]


Fig. 2. Block diagram of equipment for Experiment II


The actual arrangement of the equipment is shown in Fig. 2. A novel feature of this set-up 
was the provision of a motor-driven attenuator. The O was able to raise or lower the intensity 
of the tone simply by pressing his key in one direction or the other. Psychologically the device 
has many of the advantages of a method which gives the O control of the stimulus. It denies 
him, however, any substantial knowledge of his performance, a factor which sometimes subtly 
influences even the most conscientious. The other parts of the equipment were the same as 
those used in Experiment I. 

In detail, the procedure used was as follows: Earphone B was disconnected and attenuator 
B was terminated in an equivalent resistance. Twenty db. of attenuation were set in attenuator 
A. The O operated the motor-driven attenuator until he had reached threshold. The resulting 
attenuation, less a constant 10 db., was added to the setting on attenuator A. The process was 
then repeated with earphone A disconnected, attenuator A was terminated in a resistor, and 
earphone B was reconnected. Finally, with the two corrections added to A and B respectively, 
both earphones were reconnected and the binaural threshold was determined, again by means 
of the motor-driven attenuator. All the Os had previously had experience in determining thresh- 
olds and all except three had used the motor-driven attenuator. The Os all agreed in preferring 
this device over the hand-operated attenuator which it supplanted. 

The two monaural and one binaural thresholds were determined for each frequency before 
proceeding to the next frequency. The order in which the ears were tested, the initial test 
frequency, and the direction taken up or down the scale in selecting the remaining frequencies 
were determined by chance for each O. Thirteen Os were used. Two thresholds were deter-
mined for each O at each frequency. Two days elapsed between experimental sessions.


Results: 


The results are expressed in terms of the amount by which the 
common attenuator is readjusted when the stimulus is presented 
binaurally. They represent the difference between the equated
monaural thresholds and the binaural threshold with the ears equated 
in sensitivity. The averages for the 13 Os for two determinations at 
each frequency are given in Table IV. 


TABLE IV


DIFFERENCE BETWEEN BINAURAL THRESHOLD AND MONAURAL THRESHOLD WHEN TWO EARS
ARE PREVIOUSLY MATCHED IN SENSITIVITY, EXPRESSED IN
DECIBELS OF ATTENUATION


Frequency    Mean    Standard Error
100          3.9     .45
250          4.4     .53
500          5.6     .35
1000         3.5     .43
2000         2.9     .41
3000         3.3     .51
5000         3.2     .47
8000         2.2     .53


The standard errors of each average have been calculated on the 
assumption that the 26 measures are fully independent. This as- 
sumption may overestimate slightly the actual variability. It is 
evident that there is almost no chance that these differences are due to 
random factors. On the other hand, the differences among the means 
as a function of frequency may very well not be significant, except in 
two or three extreme instances. 
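
As a rough check on the statement that these differences are unlikely to be due to random factors, each mean in Table IV may be divided by its standard error. The sketch below assumes the table values as read from the printed table, with decimal points restored; it is an illustration only and not the computation used in the original analysis.

```python
# Mean binaural gain (db.) and standard error by frequency, as read from Table IV.
table_iv = {
    100: (3.9, 0.45),
    250: (4.4, 0.53),
    500: (5.6, 0.35),
    1000: (3.5, 0.43),
    2000: (2.9, 0.41),
    3000: (3.3, 0.51),
    5000: (3.2, 0.47),
    8000: (2.2, 0.53),
}

# Critical ratio: how many standard errors each mean lies above zero.
critical_ratios = {f: mean / se for f, (mean, se) in table_iv.items()}

for f, ratio in critical_ratios.items():
    print(f"{f:>5} c.p.s.  mean/S.E. = {ratio:.1f}")

smallest = min(critical_ratios.values())
```

Even the smallest ratio, at 8000 c.p.s., is greater than four standard errors.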

Again it is interesting to express the binaural threshold in terms of 
the sensation level using the equated monaural thresholds as a 
reference point. If we select the directly comparable frequencies we 
get the following values: 


Experiment I
Monaural threshold (average of all ears)          0.0 db. S.L.
Monaural threshold (better ear)                  -1.8 db.
‘Normal’ binaural threshold                      -3.9 db.

Experiment II
Monaural threshold (equated ears)                 0.0 db. S.L.
‘Maximum’ binaural threshold with ears equated   -3.6 db.


This comparison makes it clear that a substantial amount of summa- 
tion, 3.6 db., takes place when the ears are matched, and that, if 
one ear is more sensitive than the other, the apparent amount of 
summation becomes slightly greater, 3.9 db., simply because the more 
sensitive ear gradually becomes the determinant factor. The amount
of summation measured from the more sensitive of two unequal ears, 
2.1 db., is substantially less than the maximum summation measured 
with equated ears. 
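
The arithmetic behind the 2.1 db. figure can be made explicit. The following sketch uses only the sensation levels quoted in this comparison; the variable names are illustrative and do not come from the original report.

```python
# Sensation levels (db. re the average monaural threshold) from Experiment I.
better_ear_sl = -1.8        # monaural threshold, better ear
normal_binaural_sl = -3.9   # 'normal' binaural threshold
# Experiment II, re the equated monaural thresholds.
max_binaural_sl = -3.6      # 'maximum' binaural threshold, ears equated

# Summation measured from the better ear rather than from the average ear.
summation_from_better_ear = round(better_ear_sl - normal_binaural_sl, 1)
# With ears equated the reference is 0.0 db., so the gain is the binaural level itself.
summation_equated = round(0.0 - max_binaural_sl, 1)

print(summation_from_better_ear)  # 2.1
print(summation_equated)          # 3.6
```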


EXPERIMENT III


The binaural hearing of speech depends on many factors other 
than the absolute threshold for pure tones. In the course, therefore, 
of a later study of binaural speech, it seemed relevant to find out 
whether the relations found in Experiment II would hold as well for 
speech as for tones. Thresholds were determined for speech intel-
ligibility using one and two ears under the conditions obtaining in the
earlier experiment.² It should be emphasized that this study relates
specifically to intelligibility and not to audibility. 


Procedure: 


The tests used for determining the threshold of speech intelligibility were the several lists 
of Psycho-Acoustic Laboratory Auditory Test No. 9 (4). Two lists of 42 words were each re- 
corded phonographically six times, each record having the words in a different order. On a 
given record each succeeding group of six words is four db. less intense, so that a list of 42 words 
covers a range of 24 db. The 84 bisyllabic words in Auditory Test No. 9 were chosen because 
of their familiarity and homogeneity with respect to audibility. Thus, the lists provide a fairly 
sensitive device for measuring speech thresholds (1). 
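
The structure of a single test list follows directly from the figures given above. The sketch below simply reproduces that arithmetic; the constant names are illustrative, and levels are expressed relative to the first group of words rather than in absolute units.

```python
WORDS_PER_LIST = 42
WORDS_PER_GROUP = 6
STEP_DB = 4  # each succeeding group is 4 db. less intense

n_groups = WORDS_PER_LIST // WORDS_PER_GROUP
# Level of each group relative to the first (most intense) group, in db.
relative_levels = [-STEP_DB * i for i in range(n_groups)]
range_db = relative_levels[0] - relative_levels[-1]

print(n_groups)         # 7
print(relative_levels)  # [0, -4, -8, -12, -16, -20, -24]
print(range_db)         # 24
```

Seven groups separated by six steps of four db. give the 24 db. range stated in the text.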

The records were played with a high-quality phonographic pick-up, and the speech, after 
suitable amplification and attenuation, was fed to two Permoflux PDR-10 earphones. A block 
diagram of the equipment is shown in Fig. 3. 


Fig. 3. Block diagram of equipment for Experiment III


Each S was given at least 12 individual tests, the first eight of which were presented to the 
right or left ear in RLLRRLLR or LRRLLRRL order. Amplifier gain was adjusted so that 
a test tone recorded at the same level on each test record produced the same voltage across the 
earphone for each test. (All threshold values are expressed in decibels re 1.0 v. of the test tone 
across the phone unless otherwise stated.) For testing the right ear, the left attenuator was 
set at 100 db., and the right attenuator at 10 db. The master attenuator and the number of 
words correct at each level on the record then determined the right threshold. The reverse 
procedure, of course, was used for the left ear threshold. 


Results: 


The thresholds given in Table V are mean thresholds for four 
tests on each ear of Ss W, G, and I, and five tests on each ear of Ss 
J and C. 


² Speech intelligibility threshold is here defined as the level at which an S hears correctly
50 percent of the words presented.


TABLE V


MEAN MONAURAL THRESHOLD VALUES FOR SPEECH INTELLIGIBILITY
EXPRESSED IN DB. re ONE VOLT


Subject    Right Threshold (db.)    S.E.    Left Threshold (db.)    S.E.
W          -113.5                   .65     -112.8                  1.91
G          -110.3                   .48     -111.5                  1.19
I          -108.3                   .48     -108.5                  .29
J          -115.0                   .85     -115.8                  1.20
C          -109.4                   .75     -108.0                  .45


Where the two ears showed a significant difference, it was neces- 
sary to ‘equate’ the two ears before permitting binaural listening. 
The method of equating may be shown if we consider a specific case. 
Subject G’s mean right threshold was -110.3 db. (re one volt) and
his left was -111.5. Sensitivity for speech was then approximately
one db. greater in the left ear than the right. If binaural listening is
then accomplished with the level in the right phone one db. greater
than that in the left, we assume that the stimuli presented to both
ears are functionally equal. For example, if the level in the right ear
of this S is -110.4 db., and in the left -111.4 db., both ears are
zero db. sensation level referred to the respective monaural thresh- 
olds. (Subject I’s two ears were approximately the same, and so his 
two ears were considered to be equated when the voltage across the 
two phones was the same.) If the binaural threshold were not lower 
than the monaural, i.e., if there were no summation, we should expect 
Subject G to hear 50 percent of the words correctly at this level. 
This is not the case, however, as is shown in the results for all listeners
in Table VI. Each threshold given is a mean of four tests on each
S. Values are expressed in db. below the equated monaural
threshold.


TABLE VI


MEAN BINAURAL THRESHOLD VALUES FOR SPEECH INTELLIGIBILITY
EXPRESSED IN DB. re THE EQUATED MONAURAL THRESHOLD


Subject    Equated Monaural Thresholds    Binaural Threshold (db.)    S.E.
W          0                              -3.7                        .48
G                                         -3.9
J                                         -2.5                        124
C                                         -2.8                        0.0
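
The equating procedure described for Subject G can be sketched numerically. The thresholds and example levels are those quoted in the text; the function name `sensation_level` is an illustrative assumption, not a term from the original apparatus or analysis.

```python
def sensation_level(level_db, threshold_db):
    """Level in db. above (+) or below (-) the ear's own monaural threshold."""
    return level_db - threshold_db

# Subject G's mean monaural thresholds (db. re one volt), from the text.
right_threshold = -110.3
left_threshold = -111.5

# Exact difference between the ears; the text rounds this to about one db.
offset_db = round(right_threshold - left_threshold, 1)

# Levels used in the text's example of functionally equal stimulation.
right_sl = round(sensation_level(-110.4, right_threshold), 1)
left_sl = round(sensation_level(-111.4, left_threshold), 1)

print(offset_db)          # 1.2
print(right_sl, left_sl)  # -0.1 0.1
```

With a one-db. offset between the phones, each ear is within a tenth of a decibel of its own threshold, which is the sense in which both ears are treated as being at zero db. sensation level.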

The mean binaural threshold for the five Ss was -3.2 db. sensa-
tion level. This corresponds fairly well with the difference noted in
the previous experiment for pure tones at 1000 c.p.s. and above. It
should be noted again that the difficulty when comparing a binaural
threshold to the ‘better’ ear or the ‘average’ ear is herein avoided so 
long as the stimuli impressed on both ears are functionally equal. 


SUMMARY 


The absolute threshold for pure tones is lower when both ears are 
stimulated than when either ear is stimulated alone. The normal 
summation at threshold is a function of the relative sensitivity of the 
two ears. For a group of listeners with substantially normal hearing
in both ears, the binaural threshold is from one to two db. lower than 
the best monaural threshold. 

Somewhat greater summation can be shown if the two ears are 
arbitrarily equated by first determining their respective monaural 
thresholds and then decreasing the tone proportionately below the 
threshold of each. The maximum summation present in this case is 
about 3.6 db. 

There is limited evidence that the amount of summation may be 
greater at some frequencies than at others. This finding, however, 
should not be accepted without further check. 

The binaural threshold for speech intelligibility behaves in the 
same manner as does the binaural threshold for pure tones. When 
the two ears are presented with speech which is functionally equal in 
intensity, the binaural threshold is approximately three db. lower 
than that of either ear alone. 


(Manuscript received July 25, 1946) 


THE ILLUSTRATION OF THE HORIZONTAL-VERTICAL ILLUSION


BY FRANK W. FINGER 
University of Virginia 
AND 


DAVID K. SPELT 
University of Florida 


INTRODUCTION 


Textbook writers and classroom demonstrators are frequently 
hard pressed to find material that will illustrate a given psychological 
principle strikingly, conveniently, and unequivocally. It sometimes 
happens that an example will creep into the folklore of the profession 
and persist unchallenged for many years in spite of the fact that its 
success depends upon the operation of unrecognized factors that in 
reality vitiate its alleged significance. The common use of the in- 
verted T to illustrate the horizontal-vertical illusion is a case in point, 
as this report will testify. It will be shown that in this figure the 
horizontal line tends to be perceived as the shorter not just because 
it is horizontal but also because of its bisection by the vertical line. 

Fick (2), writing in 1851, is usually given credit for first calling 
attention to the discrepancy in horizontal and vertical judgments. 
In what was probably the first systematic investigation of this general
error, he demonstrated that “. . . a bright square on a dark ground
looks like a vertical oblong . . .” (2, p. 83). A little later Wundt
(21) and Helmholtz (4) concurred, observing that “. . . we are dis-
posed to regard vertical lines as being longer than horizontal lines of
the same length” (4, p. 170). James (5, p. 264) illustrated the
principle with both a square and a cross, and Sanford (15, p. 249) 
added the L form. In Titchener’s Experimental psychology (17, p. 
160) these figures were supplemented by the inverted T, and a few 
years later Ladd and Woodworth (6) presented both an L with the 
limbs slightly separated and the inverted T to demonstrate that 
“Vertical distances are perceived as greater than mathematically 
equal horizontal distances” (p. 437). Pillsbury (10, p. 305) re- 
produced Titchener’s sketches, including the inverted T, and then the 
trend in textbooks swung toward reproducing just the inverted T or 
a modification thereof (1, 3, 8, 13, 16), with few (14, 20) making use of 
the less striking but perhaps more legitimate cross or L. It has of
course been noted that a single division of a line shortens its apparent 
length (17, p. 157), but seldom (15, p. 249) has the interaction of this 
effect with the horizontal-vertical illusion been acknowledged. So 
far as we have been able to ascertain, only one brief experimental 
report (9) has mentioned that the inverted T is an illustration of 
something more than the horizontal-vertical illusion. 


PROBLEM 


The hypothesis underlying this experiment is that one significant 
factor contributing to the faulty perception of the length of the lines 
in the inverted T is the division of the horizontal line by the vertical 
line. In other words, the figure is more than an illustration of the 
horizontal-vertical illusion as originally described, for the perceived 
shortness of the horizontal line is due only in part to the different 
planes of the two lines. To separate and vary these two factors 
systematically four figures were compared (see Fig. 1): A, the L form, 
with the lines slightly separated; B, the same L rotated clockwise 
through 90°; C, the inverted T; and D, the inverted T rotated clock-
wise through 90°. In A and B the classically-assumed factor is
acting alone to produce the linear illusion. In C there is added to 
this the factor of line-division. In D the two factors are acting in 
opposition. 

On the basis of the hypothesis several testable predictions may 
be made. 


1. The error of judgment in C will be greater than that in A, for 
the latter is determined by one tendency and the former by two 
cooperating tendencies. 

2. The overestimation of the vertical in C will be greater than 
the overestimation of the vertical in D. In C the two illusory trends 
are acting in the same direction, but in D they are opposed, tending 
to neutralize each other. 

3. The overestimation of the vertical will be greater in B than 
in D. In this case the effect of one tendency (in B) will be compared 
to the effect of two antagonistic tendencies (in D). 


APPARATUS AND PROCEDURE 


Each figure consisted of a background against which were seen a standard line of fixed 
length and perpendicular to it a variable line which was to be equated to the standard by the 
subject (see Fig. 1). The ground was formed by a sheet of pressed wood, 4 ft. X 5 ft., with a 
glossy white finish. The standard line was a steel tape } in. X 38 in., painted glossy black, 
parallel to one short edge of the board and held in close contact with the board by having each 
end passed through a slot (indicated by S in Fig. 1) in the board and secured to the back surface. 
To form a variable comparison line, another steel tape, } in. X 84 in., was painted black for 
part of its length and white for the remainder. The end of the black section passed to the back
of the board through a slot near the fixed line, and the end of the white section through a second
slot 48 in. away.¹ In this manner 48 in. of the tape was always in front of the board (the rest
hidden), but by pulling one end or the other the amount of the black section visible could be 
varied. Fastened to the black end of the steel tape was a black cloth tape, with a white cloth 
tape similarly continuous with the other end. Each end of this cloth-steel tape passed over a 
plastic bearing secured to the edge of the board, and extended to the S seated 20 ft. in front of 
the board. 

To the S, the black steel tapes appeared as black lines on a white surface, but the white
section of the variable tape was scarcely distinguishable from the background. Thus, as one
or the other cloth tape was pulled, the variable black line was seen to change in length, with a
constant point of origin near the standard 38 in. line. In the shaded experimental room, fluo-
rescent lighting of the board minimized interference by shadows and highlights.


Fig. 1. Scale drawings of the apparatus for investigating the horizontal-vertical illusion.
A and B differ only in their position of presentation; similarly, D is obtained by rotating C
clockwise through 90°. The solid lines within each of the four figures represent black tapes
(4 in.), with the white section of the variable tape represented by the dashed line. The pro-
portions of black and white sections can be altered by the subject. S indicates the slots by which
the tapes pass to the back of the boards.


¹ This figure is applicable only to the board forming figures A and B. In the other board,
the slots were 54 in. apart.

Two boards were used in the experiment, each presented in two positions. Fig. 1A is the
separated L form; the vertical black line at the left can be varied in length by shifting its upper
extremity. In Fig. 1B there has been a 90° clockwise rotation of the apparatus, so that the
variable line is now horizontal at the top. Fig. 1C is the inverted T. Its base is a constant 38 
in., the vertical bisecting limb adjustable in its upward extent. Fig. 1D represents a 90° clock- 
wise rotation, bringing the fixed line into vertical position at the left with the variable line ex- 
tending horizontally toward the right. 

Before the experimental session proper began, the binocular visual acuity of the S was 
tested with the Snellen chart, glasses being worn if customary. The apparatus was then ar-
ranged to present the first figure. The E read the initial instructions, allowing the S to try out
the manipulation of the variable line. 


This is the first of four different problems in visual perception that you will be given. 
As you see, it consists of two black lines [point]. The length of one can be altered [demon- 
strate and have S manipulate]. I want you to change the movable line so that it seems 
exactly the same length as the fixed line. This is not a trick or a test, and there is no right 
answer. Simply set the movable line so that the two seem to be of the same length. I 
shall ask you to do it several times, but always make each judgment without trying to 
compare it with what you did the previous time. 


Ten determinations were made by the method of average error, with the S asked to close his 
eyes between trials while the measurement was made and the variable line readjusted. The 
other three figures were then successively presented for 10 trials each, brief appropriate instruc-
tions accompanying each. The four figures were given to the group of Ss in a balanced order, to
equate possible learning effects. 
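
The report does not say how the balanced order was constructed. One possibility, offered here purely as an assumption, is to use every permutation of the four figures equally often; with 72 Ss and 24 possible orders, each order would then be given to three Ss. The sketch below builds such a hypothetical schedule and checks that each figure occupies each serial position equally often.

```python
from collections import Counter
from itertools import permutations

FIGURES = "ABCD"
N_SUBJECTS = 72

orders = list(permutations(FIGURES))   # all 24 possible presentation orders
repeats = N_SUBJECTS // len(orders)    # Ss per order under this hypothetical scheme
schedule = [order for order in orders for _ in range(repeats)]

# Count how often each figure appears in each of the four serial positions.
position_counts = [Counter(order[pos] for order in schedule) for pos in range(4)]

print(len(schedule), repeats)
print(position_counts[0])
```

Any scheme with this property would equate simple practice effects across figures; a Latin square would serve the same purpose with fewer distinct orders.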

Following the 40 trials, a standard set of questions was asked to ascertain the S’s information 
concerning perceptual matters in general and this illusion in particular. 

Seventy-two college students served as Ss. Of these Ss 50 were women, ranging in age 
from 17 to 20 years (M = 18.7 years), and 22 men, 16 to 24 years old (M = 19.3 years). 


RESULTS 


Under the conditions of illumination used, only two Ss showed 
binocular visual acuity as poor as 20/40. The great majority tested 
20/20 or better. All were able to perceive the figures readily as black 
lines against a white background. In no case were the answers to 
the concluding questions regarding knowledge of the illusion such as 
to require elimination of the S’s judgments. 

For each of the four figures, the median of the S’s 10 judgments 
(recorded to the nearest } in.) was taken as the significant measure 
of his responses.² As Table I shows, the horizontal-vertical illusion
was experienced by 35 of the Ss in figure A. The average shortening 
of the variable vertical line below equality (38 in.) for all 72 Ss was 
0.39 in. (1.0 percent). In B the effect was considerably greater: 
64 Ss lengthened the horizontal line beyond 38 in., and the average 
of all 72 was 3.22 in. (8.5 percent). The discrepancy between the 
perception of the two lines in C was intermediate in degree between 
that in A and in B. Fifty of the Ss set the vertical line below a
reading of 38 in., and the average error of judgment of the 72 Ss was
2.72 in. (7.2 percent). In D, 47 of the Ss saw the horizontal as the
shorter of the two lines when they were actually equal, and the
average S lengthened it to 39.2 in. (3.2 percent error) in ‘equating’
it to the 38 in. fixed line.


² The mean deviations of judgments within the groups of 10 trials ranged from approximately
2 percent to 5 percent, averaging about 3.3 percent.


TABLE I

ERRORS IN COMPARING THE LENGTHS OF HORIZONTAL AND VERTICAL LINES

          Number            Number             Mean Overestimation of Vertical (N = 72)
Figure    Overestimating    Underestimating
          Vertical          Vertical           Inches    Percent

A         35                31                 0.39      1.0
B         64                4                  3.22      8.5
C         50                19                 2.72      7.2
D         47                20                 1.20      3.2
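
The percentages reported for the four figures follow from the mean errors in inches and the 38 in. standard. A brief check, using only values given in the text (the dictionary names are illustrative):

```python
STANDARD_IN = 38.0  # length of the fixed line, inches

# Mean overestimation of the vertical (inches) for each figure, from Table I.
mean_error_in = {"A": 0.39, "B": 3.22, "C": 2.72, "D": 1.20}

# Express each mean error as a percentage of the standard length.
percent_error = {fig: round(100 * err / STANDARD_IN, 1)
                 for fig, err in mean_error_in.items()}

print(percent_error)  # {'A': 1.0, 'B': 8.5, 'C': 7.2, 'D': 3.2}
```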

Comparing in Table II the average errors of the perception in the
four figures, we find the error in C to be 2.33 in. greater than that in
A (prediction 1), the error in C to be 1.52 in. greater than that in D
(prediction 2), and the error in B 2.02 in. greater than that in D
(prediction 3). The second of these differences is significant at ap-
proximately the .01 level of confidence by the t-test, while the other
two are significant beyond the .001 level. The group averages thus
confirm all the predictions deduced from the original hypothesis.


TABLE II
DIFFERENCES AMONG THE FOUR FIGURES IN MEAN OVERESTIMATION OF THE VERTICAL

Comparison    Prediction No.    Difference (inches)    t      P
C-A           1                 2.33                   4.9    <.001
C-D           2                 1.52                   2.6    .01 approx.
B-D           3                 2.02                   3.7    <.001
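
The three differences tested are simply differences between the group means for the relevant pairs of figures. The sketch below recomputes them from the mean errors already reported; it does not attempt to reproduce the t values, since the variances are not given in the report.

```python
# Mean overestimation of the vertical (inches) for each figure.
mean_error_in = {"A": 0.39, "B": 3.22, "C": 2.72, "D": 1.20}

# For each prediction: (figure expected to show the larger error, comparison figure).
predictions = {1: ("C", "A"), 2: ("C", "D"), 3: ("B", "D")}

differences = {n: round(mean_error_in[hi] - mean_error_in[lo], 2)
               for n, (hi, lo) in predictions.items()}

print(differences)  # {1: 2.33, 2: 1.52, 3: 2.02}
```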
Because group averaging of data sometimes distorts the actual
relationships obtaining in the individual Ss, the three predictions
were checked in each of the 72 Ss. Table III summarizes this
treatment. Prediction 1 (that the error in C would be greater than
the error in A) is confirmed in 47 Ss (mean difference = 4.15 in.),
and refuted by 19 (mean difference = -1.4 in.). Prediction 2 (that
the error in C would be greater than that in D) is confirmed by 42
(average difference 5.62 in.) and refuted by 30 Ss (mean difference =
-4.34 in.). Forty-eight individuals followed the trend of prediction
3 (that the error in B would be greater than that in D) with a mean
difference of 3.95 in., while 24 did not (mean difference = -1.72
in.). Inasmuch as there were 72 Ss involved in each of three predic-
tions, we may say that there are 216 separate tests of the hypothesis.
Of these 216 tests, 137 favor the hypothesis (mean of 4.5 in.) and 73
tend to refute it (mean of -2.7 in.). Such a distribution differs
from chance expectation beyond the .001 level, according to the chi-
square treatment. Finally, analysis reveals that for 45 Ss at least
two of the three predictions were borne out, and only in 25 were two
or more of the predictions controverted.


TABLE III
INDIVIDUAL CONFIRMATIONS OF THE PREDICTIONS (N = 72 × 3)

              Confirmed                     Refuted
Prediction    Number    Mean (inches)      Number    Mean (inches)
1             47        4.15               19        -1.4
2             42        5.62               30        -4.34
3             48        3.95               24        -1.72

Total         137       4.5                73        -2.7
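
The report does not state how the chi-square was computed. A minimal sketch, assuming a goodness-of-fit test with one degree of freedom that compares the 137 favorable and 73 unfavorable outcomes against an even split (the remaining six of the 216 tests are set aside here), gives a value consistent with the stated level of significance.

```python
import math

confirmed, refuted = 137, 73
n = confirmed + refuted
expected = n / 2  # chance expectation: equal numbers in each direction

chi_square = ((confirmed - expected) ** 2 + (refuted - expected) ** 2) / expected

# Upper-tail probability for chi-square with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi_square / 2))

print(round(chi_square, 2))  # 19.5
print(p_value < 0.001)       # True
```

The tail probability uses the identity that a chi-square variate with one degree of freedom is the square of a standard normal deviate, so no statistical library is required.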


DISCUSSION


There is little question but that the findings of this experiment 
confirm the original hypothesis that the perception of the inverted T 
is in part a function of line bisection. Each of the three predicted 
trends appeared, demonstrable both by comparing the mean group 
judgments in the several figures and by contrasting the distribution 
of individual judgment-differences against the distribution attri- 
butable to chance (and the horizontal-vertical illusion) alone. By the 
latter method, not only did the number of individuals reacting in the 
direction favorable to the hypothesis significantly outnumber those 
acting counter to the hypothesis, but the average favorable judgment 
was 67 percent greater in magnitude than was the average contrary 
judgment. 
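
The totals in Table III, and the 67 percent figure, can be recovered from the per-prediction entries. The sketch below weights each mean difference by the number of Ss contributing to it; the helper name `weighted_mean` is illustrative.

```python
# (number of Ss, mean difference in inches) for each prediction, from Table III.
confirmations = [(47, 4.15), (42, 5.62), (48, 3.95)]
refutations = [(19, -1.4), (30, -4.34), (24, -1.72)]

def weighted_mean(rows):
    """Mean difference across predictions, weighted by number of Ss."""
    total_n = sum(n for n, _ in rows)
    return sum(n * m for n, m in rows) / total_n

mean_favorable = weighted_mean(confirmations)
mean_contrary = weighted_mean(refutations)

# Relative excess of the favorable over the contrary magnitude,
# using the rounded totals reported in Table III.
excess_percent = round(100 * (4.5 / 2.7 - 1))

print(round(mean_favorable, 1), round(mean_contrary, 1), excess_percent)  # 4.5 -2.7 67
```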

On the other hand, certain of the data may at first glance be 
somewhat disappointing. To be sure, the differences examined 
statistically are great enough to be considered reliable, according to 
formal standards. Yet there are numerous exceptions to the pre- 
dicted trends. For example, 30 Ss went counter to the expectation 
stated in Prediction 2. 

These exceptions, however, are not surprising in the light of the 
complexity of this perceptual situation. The responses of the Ss are 
determined by more than just the two factors with which this 
experiment has been primarily concerned. Earlier experiments have 
revealed additional factors that might be expected to distort the 
expression of these two (cf., e.g., 11). The S’s attitude and the time 
allowed for judgment both are relevant parameters (7, 19). The 
illusory effect has been shown to decrease when the standard line is
increased beyond a certain length (18). The ‘overhanging’ vertical 
(i.e., Fig. 1B) produces in many Ss a greater error than does the 
‘standing’ vertical (i.e., Fig. 1A) (18). It has been demonstrated 
that the comparison of two lines is complicated by the tendency to 
make a variable line longer than a standard line, whatever their 
relative positions (12). It may have been the action of this last 
factor that made the confirmation of Prediction 2 the least satis-
factory of the three. In view of such complications, the surprise
should perhaps be that statistically significant trends in the predicted 
directions could nevertheless be discerned. It would be interesting 
to attempt a still clearer demonstration of our hypothesis by con- 
trolling some of these obscuring elements. 

That the horizontal-vertical illusion can be demonstrated is of 
course not controverted. The median judgment of 89 percent of the
Ss was that the vertical in Fig. 1B was longer than the actually equal 
horizontal, with the mean illusory effect for these Ss amounting to 
10 percent. But the data still more strongly support the original 
hypothesis of this study, that in the inverted T the linear judgment 
is distorted by something more than simply the direction of the lines. 
The perceived shortness of the horizontal line is in addition a distinct 
function of its bisection by the vertical. It might then be argued 
that the unqualified presentation of the inverted T as an illustration 
of the principle that “Vertical distances are perceived as greater than
mathematically equal horizontal distances” (6, p. 437) is open to
serious objection. 


SUMMARY 


Seventy-two Ss attempted to equate in length the horizontal 
line and the vertical line in each of four figures. In one figure (C) 
the horizontal line was bisected by the vertical, in the second (D) 
the vertical was bisected by the horizontal, and in the other two (A 
and B) the lines did not intersect. The three predictions underlying 
the experiment were clearly borne out: 


1. The error in C was greater than the error in A; 
2. The error in C was greater than the error in D; 
3. The error in B was greater than the error in D. 


The hypothesis on which these predictions were based was thus 
confirmed: that the error of perception in the inverted T figure is an 
illustration not simply of the horizontal-vertical illusion, but of the 
‘bisected-line’ illusion as well. 


(Manuscript received July 24, 1946) 


THE INFLUENCE OF AMOUNT OF PRACTICE UPON
THE FORMATION OF A SCALE OF JUDGMENT * 


BY M. E. TRESSELT 


New York University


INTRODUCTION 


Whenever a group of individuals are asked to make judgments 
upon objects or statements, there are three possible processes which 
might be used. The judgments might be formed according to some 
previously existing scale without change; the old scale of judgment 
might shift upward or downward to correspond with the range of the 
new stimuli; or, in the event that there is no appropriate old scale, 
an entirely new scale might be established. Ordinarily an individual 
brings with him to the task of judgment a scale which will not 
correspond with the new stimuli, so that the old scale must shift or 
give place to a new one. 

When different individuals are faced with a new task of judgment, 
their initial responses are widely distributed; but if these individuals 
continue to be exposed to the same stimuli, it has been found that 
their judgments will approach agreement. This result has been re-
garded as uniform opinion formed under conditions which are
primarily non-social (9). 

The principal experiments on the production of uniform opinion 
have been made under conditions of social stimulation and are best 
exemplified by the studies of Sherif (7, 8). He has shown that a 
convergence of judgments can be produced by the social interaction 
of groups of individuals who are making judgments on the same 
stimulus-objects. Since the voiced judgments appear to have a 
primary influence upon the responses, these experiments and similar 
experiments are ordinarily classified under the label ‘suggestion,’ 
which implies that the formation of uniform opinion requires some 
kind of social stimulation. 

Whether the scales of judgment are adjusted to the new range of 
stimuli presented, or whether the scales of judgment shift in ac- 
cordance with the judgments given by another individual, the 
processes seem to be similar, i.e., the scales have been built up from 
experience with stimulus-objects or statements in the environment. 
On this basic assumption it was decided to examine the effects of a
previously formed scale upon a new scale and the effects of differing
amounts of practice with that pre-existing scale upon the new scale.
It seems reasonable to expect that continued practice with one range
of stimuli will affect the judgments on a shifted range of stimuli.
Ansbacher (1) has already reported that when American and Cana-
dian two-cent and three-cent stamps are judged on the basis of
number and of value, there is a tendency for familiarity to influence
the perception. He states that “where a stamp has been accepted
as the stamp of one’s own country, its value aspect has been in-
teriorized as a frame of reference” and “the influence of this frame
is such that it makes the more valuable stamps appear also more
numerous” (1, p. 350).


* The author wishes to thank Dr. John Volkmann for his advice and constructive criticisms.

Related to the effect of a previously formed scale is the experi- 
ment on anchoring reported by Rogers (6), who used the method of 
single stimuli. In the first session Rogers called for judgments in six 
absolute categories without showing any anchoring stimuli, i.e.,
stimuli which designate the position of one or more categories of the 
scale. In the next session an anchoring stimulus was presented just 
prior to each stimulus to be judged. The results of this study show
that as the anchoring stimulus moves farther from the range of 
judged stimuli, it expands the scale of judgment. There is, however, 
a limit beyond which the scale will not continue to expand uniformly. 
The scale therefore is assimilated to the shifts of the anchoring 
stimulus. 

The procedure of presenting the anchoring stimulus before the 
stimulus to be judged resembles the work on the time-error phe-
nomenon. The term ‘anchoring’ as used here, and the term ‘assimi-
lation’ as used by Lauenstein (4), could be interpreted as being
similar in function. Lauenstein states that if stimuli which are 
greater or less than the standard are interpolated between the 
standard and comparison, the time-error will be positive or negative 
correspondingly. The suggestion might be made at this time that 
the interpolated stimulus was acting as an anchoring stimulus. 
Tresselt (10, 11) attempted to show that even though there are no 
interpolated stimuli the actual frame around visual stimuli might act
as a frame of reference. From the data she formulated the following
hypothesis: “If the materials are unstable or heterogeneous they do
not lend themselves to assimilation and the negative time-error will 
result. If the materials are stable or homogeneous, they lend them- 
selves to assimilation which will tend to give either the positive or 
negative time-error according to the relation of the background 
material to the standard” (11, p. 30). 

It happens that the judgments of an individual are not determined 
exclusively by the stimulus of the moment, but by a series of stimuli 
which may be separated, at least by minutes, if not hours or days, 
from the immediate objects of judgments. In the realm of the time-error 
Pratt (5) suggests that the constant errors may be a result of 
judgments based on a level of reference built up by the observer in his 
past experience. In partial substantiation of this hypothesis, Tresselt 
and Volkmann (9) have data which show that one S who worked 
in a steel plant held the weights to be light or medium for the first 
eight judgments, while a college professor held all the weights to be 
heavy or medium except one, the very lightest. 

Since there is a paucity of information on the effect of a shifted 
scale, it seems advisable to determine the effect of experimentally 
produced variations in the amount of past experience (with stimulus-objects 
similar to the objects to be used as the new task of judgment) 
upon the speed with which the Ss approach agreement in their task 
of judgment. 


PROCEDURE 


The stimulus-objects were 12 weighted cylindrical cardboard containers 6.2 cm. in diameter 
and 3.8 cm. high. Their weights in gm. were 11, 60, 110, 160, 210, 260, 310, 360, 410, 460, 510, 
and 560. The weights were separated by units of 50 gm., except for the first and second weights, 
in order that a relatively large range of weight be covered by a convenient number of stimuli. 

There were six groups of Ss. Groups I-V were given practice on each of the four heaviest 
weights, i.e., 410, 460, 510, and 560 gm., in varying degrees, so that Group I (N = 36) was 
given one practice trial on each of the four weights; Group II (N = 36) was given four practice 
trials on each weight; Group III (N = 36) was given eight practice trials; Group IV (N = 48) 
was given 12 practice trials; and Group V (N = 24) was given 28 practice trials. Group VI 
(N = 24), on the other hand, was given 12 practice trials on the four lightest weights, i.e., 11, 
60, 110, and 160 gm. 

The Ss were experimented upon individually. After giving instructions about the grasping 
and lifting of the weights, E told S to judge whether the weight felt heavy, medium, or light. 
The S was then blindfolded and given the practice trials. When these trials were completed and 
five min. had elapsed, the experimental series of all 12 weights was presented. The order of 
presentation for this part of the experiment was pre-determined in such a way that each weight 
was presented in a different position to each S. In this way, 12 Ss had each of the weights in a 
different position in the series; that is, the 11 gm. weight would be the first for subject No. 1, the 
second weight for subject No. 2, the third for subject No. 3, etc. 
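The counterbalancing described above can be sketched as a cyclic rotation of the 12 weights. This is only one scheme consistent with the description (the 11 gm. weight first for subject No. 1, second for subject No. 2, and so on); the article does not say how the remaining weights were ordered, so the rotation rule and the names `WEIGHTS_GM` and `presentation_orders` are assumptions made for illustration.

```python
# The 12 stimulus weights in grams, as listed in the Procedure.
WEIGHTS_GM = [11, 60, 110, 160, 210, 260, 310, 360, 410, 460, 510, 560]


def presentation_orders(weights):
    """Return one presentation order per subject in a block of len(weights) subjects.

    Assumed scheme: subject k (0-based) receives the base order shifted right
    by k places, so weights[0] falls in serial position k + 1.
    """
    n = len(weights)
    return [[weights[(pos - k) % n] for pos in range(n)] for k in range(n)]


orders = presentation_orders(WEIGHTS_GM)

# Subject No. 1 lifts the 11 gm. weight first, subject No. 2 lifts it second, etc.
for subject, order in enumerate(orders[:3], start=1):
    print(f"Subject No. {subject}: 11 gm. in position {order.index(11) + 1}")
```

Under this rotation every weight occupies every serial position exactly once within a block of 12 Ss, which is the property the procedure requires.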


DISCUSSION AND RESULTS 


Table I presents the data in terms of the mean weight judged 
medium, the frequency of medium judgments, and the standard 
deviation of the weights called medium, as functions of the serial 
position of presentation for all the experimental groups. The mean 
weight judged medium in the 6th position for Group II, for example, 
is 276.7, n = 6, and σ = 81.6.¹ 


1 The PSE (point of subjective equality) was not a feasible statistic because judgments in 
the heavy category gave a most irregular psychometric function. The mean weight judged 
medium appeared to represent best the center of the scale of judgment. 


TABLE I 

Showing the Mean Weight in Grams Judged Medium (M), the Frequency of Medium 
Judgments (n), and the Standard Deviation (σ), as a Function of the Serial 
Position for Groups I, II, III, IV, V, and VI 

Serial     Group I             Group II            Group III
Pos.      M   n      σ        M   n      σ        M   n      σ
 1    376.7   6   85.0    443.3   9   94.3    447.5   8   69.9
 2    315.6   9  191.1    310.0   4   93.5    363.6  14  106.0
 3    395.7   7  118.7    325.0  10   11.8    322.5   8  123.1
 4    235.0   6   94.6    295.7   7   52.3    310.0   8   61.2
 5    285.0  10  103.1    248.9   9   10.5    265.0  10   96.1
 6    198.9   9   96.6    276.7   6   81.6    278.8   8   89.9
 7    302.9   7   77.6    267.1   7   94.2    297.5   8    9.9
 8    305.8  12   77.6    293.3   9   47.1    285.0   8   55.9
 9    245.7   7   39.9    270.0  10   29.8    282.2   9   58.3
10    290.0  10   82.5    193.3   6   47.1    280.8  12   62.8
11    297.5   8   77.4    255.8  12   65.8    300.9  11   63.3
12    274.3   7   58.0    251.7  12   86.2    294.6  13   82.1

Serial     Group IV            Group V             Group VI
Pos.      M   n      σ        M   n      σ        M   n      σ
 1    452.9   7   62.3    451.7   6   83.7     85.0  11   25.0
 2    346.4  11  100.2    355.0  10   11.9    145.7   7   74.4
 3    299.3  14    9.7    360.0   5  104.9    122.5   4   54.5
 4    328.8   8   49.6    324.3   7  124.5    147.5   8   80.0
 5    256.7  18  105.4    326.7   6   84.9    160.0   8   61.2
 6    285.0   6   90.?    245.7   7  112.5    143.3   6   48.9
 7    313.3  15   68.6    210.0   1    0.0    135.0   4   25.0
 8    298.9   9   65.7    269.1  11   89.4    174.3   7   51.5
 9    298.5  13   56.1    331.4   7   58.9    181.4   7   64.7
10    277.7  17   80.1    210.0   5   89.4    178.8   8   82.7
11    270.7  14   57.3    293.3   6   55.3    140.0   5   51.0
12    272.5  12    7.4    278.8   8   65.8    185.0   8   55.9

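Each cell of Table I reduces the weights that were called medium at one serial position to three numbers: M, n, and σ. The sketch below shows that reduction. The raw judgments are not given in the article, so the input list is a hypothetical set of four weights chosen only because it reproduces the printed entry for Group VI, position 7 (M = 135.0, n = 4, σ = 25.0). The population form of the standard deviation (divisor n) is likewise an assumption; it is at least consistent with the entry of σ = 0.0 for a single judgment (Group V, position 7). The function name `summarize_medium` is illustrative.

```python
from statistics import mean, pstdev


def summarize_medium(weights_called_medium):
    """Return (M, n, sigma) for the weights judged medium at one serial position.

    sigma is the population standard deviation (divisor n); that choice is an
    assumption, since the article does not state which divisor was used.
    """
    n = len(weights_called_medium)
    m = float(mean(weights_called_medium))
    sigma = pstdev(weights_called_medium)
    return round(m, 1), n, round(sigma, 1)


# Hypothetical judgments: four Ss call the 110 or 160 gm. weight medium.
hypothetical_judgments = [110, 160, 110, 160]
print(summarize_medium(hypothetical_judgments))
```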

TABLE II 

Showing the Mean Weight in Grams Judged Medium (M), the Frequency of Medium 
Judgments (n), and the Standard Deviation (σ), as a Function of the Serial 
Position for Individuals Without Practice Trials 

(Adapted from Tresselt and Volkmann) 

Pos.      M   n      σ
 1    283.0  37  130.7
 2    243.3  30  127.5
 3    292.7  26  100.0
 4    250.3  31   73.0
 5    235.5  49  105.2
 6    211.6  31   90.8
 7    249.2  37   90.9
 8    254.6  37
 9    258.3  29   63.6
10    237.0  37   59.5
11    242.8  32   64.5
12    228.4  38   73.1


The first problem posed for consideration was to see how the 
mean weight judged medium varies as a function of the serial position 
of the stimulation. Fig. 1 shows this relation for each of the groups 
I-VI. Each graph indicates that with continued stimulation and 
judgment the mean weight judged medium approaches the center of 
the stimulus-range, although the approach is not a smooth and 
steady process. 


Fig. 1. The mean weight in gm. judged medium as a function of the serial position of the 
weights. The curves comprise the mean weight judged medium by Group I (with one practice 
trial on the four heaviest weights), by Group II (with four practice trials), by Group III (with 
eight practice trials), by Group IV (with 12 practice trials), by Group V (with 28 practice trials) 
and by Group VI (with 12 practice trials on the four lightest weights). 


Another possible interpretation of the curves is that there are two 
functions present rather than one, with a region of discontinuity in 
the vicinity of the 6th serial position. When Tresselt and Volkmann 
obtained similar data they suggested that the first portion of the 
curve represented the shift of the old scale and the second portion 
represented the construction of a new scale anchored by the new end- 
stimuli. In this experiment there is some accessory evidence which 
makes this view plausible. For example, about two-thirds of the Ss 
voiced their surprise when they were first given a weight lying well 
outside of the stimulus-range of the practice series. A few Ss stated 
spontaneously that the weights which were being presented must 
belong to an entirely new group. In other Ss the recognition of a 
change was slower because of the conditions of the presentation; 
e.g., the Ss of Group I did not report any recognition of changed 
stimulation, probably because their practice series had been so short 
that they were prepared to receive heavier or lighter weights. 


Fig. 2. The mean weight in gm. judged medium as a function of the serial position of the 
weights for Group IV (with 12 practice trials on the four heaviest weights) and for Group VI 
(with 12 practice trials on the four lightest weights). 

In order to show the progression toward uniformity of judgments 
or opinion, as produced by uniformity of stimulation, the curves for 
Group IV and VI are placed together in Fig. 2. These groups had 
both been given the same number of stimulations (12) in the practice 
series, but Group IV had had the four heaviest weights while Group 
VI had had the four lightest weights. The convergence which occurs 
may be said to be similar to the convergence found by Sherif. “In 
Sherif’s experiments with autokinesis, when individuals first establish 
their frames and standards and are then brought into group situa- 
tions, their judgments tend to converge” (7, p. 3). The experi- 
mental series in this experiment could be thought to be analogous to 
the range of voiced opinions in the group situation. 

Change in the direction of uniform opinion, and the role of 
practice in producing that change, can be seen also in the SD of the 
weights called medium. In Fig. 3 the SDs of Groups I-V are combined 
and plotted as a function of the serial position. On the same 
coordinates are plotted the corresponding SDs taken from Tresselt 
and Volkmann.² In the first position the variation in judgment is 
smaller than it is when the Ss approach the experimental session 
without previous practice. The critical ratio of the difference between 
the SD of the Tresselt and Volkmann study, first position, and 
the corresponding SD of this experiment is 2.5; the P-value lies 
between .01 and .02. There is, then, a rise in the SD which might 
be interpreted as indicating either the fading of a memory trace or a 
period of ‘reconnaissance,’ i.e., a period in which the S alters a previous 
scale while he is attempting to construct a new scale. The fading 
of a memory trace would produce an increasing SD, as would the 
assumption of a new task by trial and error learning. The pooled 
data show a point of rise at the second position to meet the curve of 
the control group. The curves follow each other from this point 
without any further significant differences. By the time the Ss have 
lifted approximately six weights, regardless of the amount of previous 
experience or kind of previous experience, the judgments are less 
widely distributed, and it might be said that the Ss have come to show 
a greater degree of conformity about what shall be called medium. 
This observation recalls a similar one in the field of learning; the 
effects of proactive inhibition are said to be discoverable only in the 
first trial, after which they disappear (12). 


Fig. 3. The SD of the distribution of medium judgments, in gm., as a function of the serial 
position of the weights. The solid line represents the findings of Tresselt and Volkmann (9); 
the dotted line represents the combined SDs of Groups I-V. 


2 This study (9) supplies what is in effect a control group. The Ss were of the same sex 
and about the same age; they were presented with the same series of weights under the same 
instructions and procedures. The only important difference is that the Ss of Tresselt and Volkmann 
were not given a preceding practice series. 

Of primary importance, however, is the second purpose of this 
experiment, viz., to examine the effect of different amounts of practice 
with a pre-existing scale upon the formation of a new scale. The 
data as arranged in Table I show a definite effect of practice upon the 
first judgment of stimuli from the new, expanded stimulus-range. 
In this first serial position, the Ss of Group I show the lowest mean 
weight judged medium, 376.7 gm.; Groups II and III are higher, at 
443.3 and 447.5 gm., respectively; Groups IV and V are still higher 
(although the difference is small), at 452.9 and 451.7 gm. The 
larger the period of practice, the more slowly does the scale of judgment 
shift to its new position. The difference between the mean of 
451.7 for Group V and the corresponding mean obtained by Tresselt 
and Volkmann has a critical ratio of 3.4, the P-value being significant 
at the .01 level; the difference between the mean of Group I and the 
mean found in the study by Tresselt and Volkmann has a non- 
significant critical ratio of 1.85 (P-value between .10 and .50). The 
effect of differing amounts of practice, although real, would seem in 
this case to be also small; the means of Groups I-V do not differ 
significantly among themselves since the largest CR (Group I and 
Group V) is only 1.53. 
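The critical ratios quoted in this section are read against a probability table. As a rough check, a critical ratio can be converted to a two-tailed probability under the normal curve. The article does not say which table was consulted, so the large-sample normal interpretation and the function name `two_tailed_p` are assumptions; with small samples a t table would give somewhat larger probabilities.

```python
from math import erfc, sqrt


def two_tailed_p(critical_ratio):
    """Two-tailed normal-curve probability of a deviation at least this large.

    Assumes the critical ratio is referred to the unit normal distribution.
    """
    return erfc(abs(critical_ratio) / sqrt(2))


# Critical ratios taken from the text: 2.5 (SD comparison) and 3.4 (Group V mean).
for cr in (2.5, 3.4):
    print(f"CR = {cr}: P = {two_tailed_p(cr):.4f}")
```

On this reading a CR of 2.5 falls between .01 and .02, and a CR of 3.4 falls below .01, in agreement with the probabilities reported for those two comparisons.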

It is not easy to discover whether different amounts of practice 
exert different effects on the second judgments and subsequent ones. 
The effects, if they exist, are obscured by the oscillatory action of the 
mean medium judgments. The mean weight judged medium of 
Group I, for example, lies below the mean of Group IV in positions 
1 and 2, above in position 3, below in 4, above in 5, below in 6 and 7, 
above in 8, below in 9, above in 10 and 11, below in 12. One might 
conjecture that the scales of Group I and Group IV are oscillating 
with the same period (approximately 3 stimulations) and 
that the scale of Group I, having moved downward more rapidly in 
the first position, is leading in phase throughout. The group which 
had the greatest number of stimulations before being presented with 
the experimental series shows the least oscillatory movement, in fact 
no movement at all, for the first four stimulations, then it begins to 
move up and down in much the same manner as the other groups. 
This tendency in the data suggests that Group V has almost no 
change in the scale of judgment but after about four stimuli have 
been exposed the scale formed by the practice series seems untenable 
and a new one is created. This suggestion resembles the finding by 
Sherif (8) that individuals tended to preserve an established scale 
over a period of several sessions. Since there are only group data 
available in this experiment, however, a close analysis of these 
oscillatory phenomena can hardly be made. 
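The alternation between Group I and Group IV described above can be tabulated directly from the means in Table I. The sketch below uses the values as they appear in this copy of the table, which may contain transcription errors; the helper name `relative_position` is illustrative.

```python
# Mean weight judged medium (gm.) by serial position, as transcribed from Table I.
GROUP_I = [376.7, 315.6, 395.7, 235.0, 285.0, 198.9,
           302.9, 305.8, 245.7, 290.0, 297.5, 274.3]
GROUP_IV = [452.9, 346.4, 299.3, 328.8, 256.7, 285.0,
            313.3, 298.9, 298.5, 277.7, 270.7, 272.5]


def relative_position(a, b):
    """Label each serial position by whether series a lies above or below series b."""
    return ["above" if x > y else "below" if x < y else "equal"
            for x, y in zip(a, b)]


pattern = relative_position(GROUP_I, GROUP_IV)

# Count how often the two curves change places between adjacent positions.
crossings = sum(p != q for p, q in zip(pattern, pattern[1:]))

for position, label in enumerate(pattern, start=1):
    print(position, label)
print("crossings:", crossings)
```

With these values, positions 1 through 11 reproduce the sequence given in the text. Position 12 comes out marginally above (274.3 against 272.5) where the text reports below, which may reflect a misprint or a scanning error in one of the two entries.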

It seems possible to hold a scale so firmly established, so over- 
learned, that it will not shift appreciably throughout the experimental 
series. This possibility is exemplified by the one S who belonged to 
a club in which the recreation involved lifting weights of 100 to 300 
pounds, and who called all of the weights light. Throughout the 12 
stimulations of the experimental series, he did not use the categories 
of medium or heavy once. 

Everyday life seems to present examples of shifts in absolute 
scales which probably are brought about by a change in the stimulus 
situation, although it has been suggested concerning social move- 
ments that some old scales may not shift because they are firmly 
anchored (2). The strength of anchoring, however, presumably 
varies with the manner in which scales have been formed. One 
characteristic of the data is outstanding: the rapidity with which the 
scale of judgment shifts. But a scale may undergo a major shift 
for any of several reasons: (a) because a new series of stimuli is being 
repeatedly presented, as in the present study; (b) because of additional 
anchoring agents involved by instructions, as in Rogers’ 
experiment (6); (c) under the influence of the voiced judgments of 
other people, as in Sherif’s original experiment. Perhaps the rapidity 
of shift would vary most widely with the emotional or motivational 
conditions of the judgment. A single stimulation of traumatic 
intensity might force a very rapid shift; motivated attachment to an 
old scale might impede the shift. It will be left to further research 
to explore these possibilities. 


SUMMARY 


Since an individual ordinarily brings an old scale with him to a 
task of judgment, it was decided to examine the effects of a previously 
formed scale upon a new scale and the effects of differing amounts of 
practice with that pre-existing scale upon the new scale. Two 
hundred four individuals were called upon to make judgments upon 
a practice series of weights which consisted of four heavy weights 
or four light weights, and then were given an expanded scale of 
weights. 

The major results suggest that: 


1. The center of the scale of judgment for the expanded range of 
stimuli will be at first in the direction of the center of the previously 
existing scale, and then, as the scale approaches the stimulus-range, 
will begin to oscillate. 

2. Over a period of time the scales rapidly conform to the stimulus- 
range regardless of whether or not the practice series contained 
stimuli at the top or bottom of the expanded scales. 

3. There is a definite effect of different amounts of practice upon 
the first judgment of stimuli in the expanded stimulus-range. The 
greater the amount of practice, the more slowly does the scale of 
judgment shift to its new position. 

4. Whether there is a prolonged effect of the practice series upon 
the center of the new scale is questionable, since the oscillations in the 
categories of judgment confound analysis at this time. Since one 
individual did not change the center of his own well-established scale 
throughout the experiment, the suggestion is made that the greater 
the period of practice the slower will be the change of the pre-estab- 
lished scale to conform to the new range of stimuli. 


(Manuscript received July 19, 1946) 


A COMMENT ON ‘AN ATTEMPT TO CONDITION 
THE CHRONIC SPINAL DOG’ 


BY P. S. SHURRAGER 
Illinois Institute of Technology 


With the consent of the authors, Kellogg, Deese, Pronko, and Feinberg, 
the editors sent me a copy of ‘An attempt to condition the chronic spinal 
dog’ (1) and, since the conclusions presented therein are not in accord with 
those arrived at by Dr. Culler and myself (3), asked if I cared to append a 
comment. 

The original spinal conditioning experiments reported by Shurrager and 
Culler dealt almost exclusively with acute spinal preparations. My own 
efforts to demonstrate conditioning in chronic spinal dogs were unsuccessful. 
The drastic nature of the acute operative procedures, which generally 
included artificial respiration, made virtually impossible keeping animals for 
more than a few days. However, the characteristic spinal CR could be 
extinguished and re-established in several acute preparations which lived 
into the third day of testing. In these preparations the CR was still stable 
and predictable. If the CR was established when training was discontinued, 
it was present at the first trial even if a rest period of several hours in- 
tervened. 

In the course of conditioning work with acute spinal dogs, over 500 
preparations were observed. About half of these were used as controls 
to test the validity of the spinal CR and to determine whether or not it met 
four criteria of conditioning: (1) that the CR was not evoked by the CS 
prior to conditioning (even in 1000 trials at 15-sec. intervals with double or 
triple CS stimuli spaced one sec. apart); (2) that the CR appeared and 
became firmly established as conditioning was continued; (3) that the CR 
was extinguishable; and (4) that the CR was not subject to spontaneous 
recovery. The phenomena described in ‘Conditioning in the spinal dog’ 
(3) conformed to these criteria and may therefore justifiably be called 
spinal conditioned responses, rather than striated components of the startle 
response facilitated by reflex sensitization, as Kellogg et al. suggest. Since 
the results reported by these authors do not conform to the above criteria, 
they are, I agree, bilateral reflexes rather than conditioned responses. 

It was further determined (3) that the spinal CR differs from a spinal 
reflex in that the stability of the CR response is not dependent upon the 
intensity of the stimulus, but rather upon the number of conditioning trials. 
The CS used was a moderate shock of short duration (strong enough to 
cause a tail tip movement indicating that the CS was received) or the tap 
of a stiff bristled brush applied to the tip of the tail (3, p. 159). The 
intensity of this stimulus could be gradually increased without increasing 
the resultant CR, although when the CS stimulus became too strong, 
generalized bilateral reflex responses occurred. These typical generalized 
reflex responses which increase in intensity as the stimulus increases in 
intensity are characteristically exaggerated in the chronic spinal animal 
when it is similarly stimulated. The generalized reflexes obscure the CR. 

The CR was carefully observed through a binocular microscope in order 
to discern its extent. At all times the response was either subject to close 
inspection or recorded by balanced levers attached to the fascia immediately 
over the conditioned muscle fibers. Less than a half-dozen acute spinal 
preparations were observed in which conditioning spread beyond the in- 
sertional pit or insertional quarter of the Semitendinosus muscle, even when 
conditioning trials ran well over 1000 in a 24 to 28 hour period. 

Kellogg et al. have been scrupulous in pointing out the differences be- 
tween their technique and that described in (3). Of these, the most effective 
in contributing to divergent results are probably (1) the unfortunate choice 
of the contralateral leg as the locus of the CS, which introduces into the 
problem the complicating factors of the crossed-extension reflexes; (2) the 
intensity of the CS used, which was so strong as to initiate a series of masking 
generalized responses; and (3) the method of observation, which could not 
detect a minute insertional CR response. 

It is probable that the variation in type of contralateral responses (i.e., 
reflex twitch or extension) recorded in the Kellogg et al. study is a corollary 
of applying the CS to the contralateral leg. In spinal preparations the 
nature of bilateral responses is materially affected by varying conditions of 
tonus patterns in the bilaterally responding members at the time of stimu- 
lation. In both acute and chronic preparations there are inherent complex 
functional neural patterns involving both flexor and extensor components 
which are alternately dominant and which depend materially upon the 
states of tonus in contralateral legs at the moment responses are initiated. 
An example of these has been observed in the unilateral progression response 
which occurs and continues for several seconds after a single instantaneous 
relatively intense stimulus has been applied to one hind leg of a spinal 
preparation in which contralateral motor roots have been severed (2). 

It is unlikely that testing the chronic animals in an upright position 
adversely affected the results, since the spinal CR was observed in acute 
spinal animals in upright, inverted, and free-lying preparations. It is im- 
possible at this time to tell whether or not the fact that in the Kellogg 
experiment CS and UCS were never presented simultaneously too greatly 
increased the difficulty of the problem for the spinal preparations. 

That spinal conditioning in most preparations follows a definite pro- 
gressive pattern is a fact. It is also true that spinal preparations are 
encountered in which prolonged training fails to produce a single CR. 
However, it is not uncommon to find a normal animal which fails to learn a 
simple differential flexion response in hundreds of conditioning trials. It 
did not, therefore, seem too surprising when, in the original experiment at 
Illinois (3), 121 of 219 spinal preparations were found to be unconditionable 
under the terms of the experiment. 

The Kellogg et al. study is interesting as an attempt to produce conditioning 
in the spinal dog. However, because of certain aspects of the 
technique employed and the limited number of spinal preparations observed, 
the results do not appear to be conclusive. In fact, it is questionable if the 
technique they described constituted a learnable situation for spinal preparations 
and whether, if the spinal CR had occurred in their experiment, 
it could have been observed by the recording technique used by Kellogg, 
Deese, Pronko and Feinberg. 


IS ‘SPINAL CONDITIONING’ CONDITIONING? 
REPLY TO ‘A COMMENT’ 


BY W. N. KELLOGG 


Conditioning Laboratory, Indiana University 


Many thanks to Dr. Shurrager for the fair and friendly manner in which 
he has evaluated our work (2). 

We hope it is clear to the reader of these remarks that Kellogg, Deese, 
Pronko, and Feinberg (1) have no wish to discredit the striking discovery 
which the phenomenon of spinal conditioning appears to be. That the 
spinal muscle-twitch is an established fact and that under certain circum- 
stances it increases and decreases in frequency there can be no doubt. The 
only question that we raise concerning it is whether the phenomenon is one 
of conditioned learning. Our own experimental work in this direction can 
hardly be regarded as ‘conclusive’ as Shurrager has already indicated (2), 
and we have no intention of implying that we consider it so. The most it 
can do, I think, is to point out the possibility that interpretations other than 
‘conditioning’ can be applied to changes in the frequency of occurrence of 
the muscle-twitch phenomenon. 

Perhaps a word concerning the differences between chronic and acute 
laboratory preparations would be in order at this point. Research workers 
would generally agree, I believe, that for the study of long-time processes— 
as, for example, for the study of learning, conditioning, or retention—one 
chronic preparation should be worth a good many acute preparations. The 
principal reasons for employing acute preparations in such research would 
probably be either (a) that the mutilation was so severe that the organism 
could not survive, or (b) that the phenomenon to be observed could not be 
demonstrated in the chronic state. The extreme difficulty of keeping and 
caring for the seriously debilitated specimen makes the successful maintenance 
of even one chronic animal, in some instances, a laboratory achievement. 
So it is perhaps fairer to compare our work with that of Shurrager 
and Culler in terms of some measure of the extensiveness of the two experi- 
ments, as for example, in terms of the total number of trials given, in terms 
of the length of time or amount of labor expended in the two investigations, 
or by some measuring device other than the number of cases—since the 
cases are not at all the same kinds of cases in the two instances. 

If what appears to be spinal conditioning is really conditioning, then I 
think we must have more adequate explanations of the following points: 


1. Since it takes from 100 to 400 trials¹ to condition the normal dog to 
the point of 100 percent efficiency, how can one explain the fact that the 
hindquarters of the acute spinal animal by themselves reach 100 percent 
efficiency within 40, 30, or even 20 trials of conditioned-reflex training (3, 
pp. 142-3)? 

2. If, as Shurrager admits (2), a spread of the muscle-twitch reaction 
to other body parts occurs with intense stimulation, what justification is there 
for calling muscle-twitch responses to mild stimulation—when there is no 
spread, or when the spread is unobservable—‘conditioned’ responses? 

3. The reactions of our own animals were graphically recorded on every 
trial and so were subject to careful examination and rechecking. Even 
under these circumstances there were often doubtful or questionable 
responses. It now appears that the reactions of Shurrager’s preparations 
were, in some instances, not graphically recorded at all, but were observed 
visually by ‘close inspection’ (2). Our experience has been that a good 
graphic recording technique, if it is possible to use one, is far superior to 
direct visual observation. Is it not possible—if all the responses of Shur- 
rager’s animals had been graphically recorded—that the frequencies with 
which the spinal muscle-twitches were found to occur would have turned 
out to be different? 

4. Why does the phenomenon occur in less than half of the cases studied? 
This fact is explained by Shurrager as a kind of analogue of the variability 
of the rate of learning of normal dogs. ‘‘It is not uncommon to find a 
normal animal which fails to learn a simple differential flexion response in 
hundreds of conditioning trials,’’ he writes (2). In the Indiana Conditioning 
Laboratory in nearly 10 years of experimental work with dogs, we have 
never found a normal animal which was unable to develop a simple flexion 
CR within 400 trials or less. We are puzzled by the term ‘differential’ in 
this connection. 

5. Shurrager also states that his ‘“‘efforts to demonstrate conditioning 
in chronic spinal dogs were unsuccessful” (2). Does such a statement, 
coupled with the results reported by Kellogg, Deese, Pronko, and Feinberg 
(1), mean that conditioning can occur only in the acute, and never in the 
chronic spinal animal? If this is the case, one is prone to raise the question 
of whether spinal conditioning, after all, may not be a post-operational 
artifact, due possibly to the wearing off of the anesthetic, to recovery from 
surgical shock, or to some other similar factor. 


1 When both conditioned and unconditioned stimuli are electric shocks, and the temporal 
pattern of the stimuli is like that used in spinal conditioning, the total number of trials required 
to reach the 100 percent level in the intact animal is usually closer to 400 than it is to 100 trials. 


Certainly additional observations are needed before we can agree upon 
the true nature of the phenomenon. More research is planned on this 
question in the Indiana Conditioning Laboratory. It is possible, of course, 
that we shall ultimately succeed in demonstrating spinal conditioning in all 
of the aspects reported by Shurrager and Culler. Up to the present, 
however, we have been unable to do so. With luck we may be able to 
report further upon the matter, at some later time. 


REFERENCES 


1. KELLOGG, W. N., DEESE, J., PRONKO, N. H., & FEINBERG, M. An attempt to condition the 
chronic spinal dog. J. exp. Psychol., 1947, 37, 99-117. 
2. SHURRAGER, P. S. A comment on ‘An attempt to condition the chronic spinal dog.’ J. exp. 
Psychol., 1947, 37, 261-263. 
3. SHURRAGER, P. S., & CULLER, E. Conditioning in the spinal dog. J. exp. Psychol., 1940, 
26, 133-159. 




DISCUSSION 
READING AND THE RATE OF BLINKING 


BY MATTHEW LUCKIESH 


Director, Lighting Research Laboratory, 
General Electric Company, 
Nela Park, Cleveland 


A number of investigators, employing various test procedures, have used 
the blink-rate as a measure of visual fatigue, with the result that doubt has 
been cast upon it as a valid criterion. A study of the published papers 
indicates that when similar test procedures were used, the experimental 
results agreed qualitatively. However, when the test procedures did not 
correspond, there is little correlation between the results. 

In the investigations of the author and his colleagues, the task of reading 
was reduced to its simplest form. We used adult Ss who were accustomed to reading 
steadily and continuously, that is, naturally and automatically. There was 
nothing novel or unnatural in the common task of reading. The Ss were 
not required to perform other tasks during the reading period because of the 
possibility of destroying their mental or physical equilibrium. Only one 
known experimental variable, such as level of illumination, size of type, 
style of type, etc., was involved in each of our investigations. The only 
interruption was the turning of pages of the book, which the S did himself, 
and which is a normal incident of reading. The text was selected for its 
uniformity of interest and for its low and uniform emotional value. In other 
words, the act of reading was controlled only to the extent necessary to in- 
sure that it would be a normal and steady task and a factor as constant as 
possible during the investigation, with a minimum effect upon the criterion, 
so that the effect of the experimental variable could be obtained. Com- 
plete control was exercised over the environmental factors by making them 
quiet, comfortable and inconspicuous. Distractions were minimized, 
including distraction of the test procedure. S was not under any ex- 
traneous physical or mental strain not involved in reading naturally. 

As a contrast to the above, Bitterman (1, 2) made considerable use of 
the Minnesota Vocational Test for Clerical Workers, which requires the Ss 
to check paired names and number groups for identity and non-identity. 
While it is not specifically stated, it is understood that the Ss were instructed 
to work rapidly and accurately. In two investigations the experimental 
variables were (a) size of type (6- and 12-point) and (b) alternating periods 
of silence and distraction from a phonograph recording. In neither test 
were significant results obtained. Tinker (10) favors his own Speed-of- 
Reading Test to compare lower case with all-capital printed matter, in 
which the S is required to indicate orally one word in each of a series of 
sentences which spoils the meaning. These investigations differ so basically 
from ours that it is somewhat surprising that psychologists would seriously 
compare the respective results. 

We have taken great care in planning our tests so as to eliminate as much 
as possible various diverse physical, physiological, and psychological 
factors that influence the rate of involuntary blinking. Some of these 
effects have been adequately demonstrated. Extraneous factors may either 
suppress or augment the normal blink-rate of an S, with the result that the 
effect of the experimental variable may be completely lost. Psychologists, 
especially, should be aware of this, for they usually are careful to minimize 
such effects. Peterson and Allison (7) found that different sorts of mental 
preoccupation such as (a) reading and (b) cancelling certain letters in a 
printed cancellation sheet, or (c) near fixation and (d) distant fixation, 
caused about a two to one change in the blink-rate. Telford and Thompson 
(8) found the blink-rates during a five-min. period for mental arithmetic, 
reading, and conversation to be 54.43, 14.36 and 44.47, respectively. Duke- 
Elder (3), Luckiesh and Moss (5), and others have obtained similar results 
for tasks that require various amounts of visual and mental concentration. 

Obviously, the statistical reliability of the data is important. Tinker 
and Bitterman frequently refer to McFarland, Holway, and Hurvich (6) 
who used three Ss, three levels of illumination, and only one set of data for 
each S. Data of this type are anything but reliable. A salient character- 
istic of Tinker’s, Bitterman’s, and Hoffman’s work is the small number of 
repeat tests by their Ss. Often only one test was made. To a limited 
extent this is countered by a large number of Ss. However, there is no 
assurance that the S will perform the same way a second time. There 
should be some explainable reasons other than questioning the validity of 
the criterion. 

Two investigations of reading for a period of time are directly comparable 
with one conducted by the author and his colleagues. The corresponding 
tests are summarized as follows: 


1. Luckiesh and Moss (5). Eleven Ss read good printing for an 
hour under a level of illumination of 10 footcandles. The blinks were 
counted during the first and last five-min. periods. Each S repeated 
the test once. 

2. Tinker (9). This is the only test conducted by Tinker that can 
be compared with ours. A group of 74 Ss read well-printed material 
for a half hour under 10 footcandles, the number of blinks being counted 
for each five-min. period. The test was repeated with another group 
of 64 Ss. 

3. Hoffman (4). Thirty Ss each read good printing for a single 
four-hour period under 10.5 footcandles. Eye movements, including 
blinks, were recorded electrically during the first five min. of the reading 
period and thereafter during the last five min. of each half hour. 


The results of the above tests are summarized in Table I. These 
results are in close agreement, indicating that a correlation can be obtained 

TABLE I 
Comparison of Blinking Frequencies in Four 

                                              Number of Blinks 
                                          First 5 Min.   Last 5 Min.   Percent Increase 
Luckiesh and Moss 
  11 Ss, two sittings, one hr. reading        35            45             31 
Hoffman 
  30 Ss, one sitting, one hr. reading         34.3          43.6           27 
Tinker 
  74 Ss, one sitting, ½ hr. reading           18.7          23.4           25.2 
  64 Ss, one sitting, ½ hr. reading           25.2          34.3           36.2 


when the experimental procedures are similar. Hoffman found a rather 
steady increase in the rate of blinking throughout the entire four-hour 
period, the increase at the end being 60 percent. The plotted data indicated 
that the rate of increase was diminishing and that longer periods would not 
necessarily produce a much greater blink-rate. While Tinker used only 
half-hour reading periods, the increase during the reading period was of 
the same order as the other two investigators found for one-hour reading 
periods. 
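
As an illustrative check that is not part of the original note, the percent increases can be recomputed from the rounded means printed in Table I. The sketch below assumes the simple definition (last minus first, divided by first, times 100); the function and variable names are hypothetical. Because only rounded group means are available here, the recomputed figures come out near the published column without matching it exactly (for Luckiesh and Moss, about 28.6 against the printed 31), which suggests, though the source does not say so, that the published values were derived from unrounded or per-subject data.

```python
# Illustrative sketch only: recompute percent increase from the rounded
# means printed in Table I. Names and the formula are assumptions, not
# taken from the original investigations.

def percent_increase(first: float, last: float) -> float:
    """Percent change from the first to the last five-min. blink count."""
    if first <= 0:
        raise ValueError("first-period count must be positive")
    return (last - first) / first * 100.0


# (label, blinks in first 5 min., blinks in last 5 min.) as printed in Table I
TABLE_I = [
    ("Luckiesh and Moss, 11 Ss", 35.0, 45.0),
    ("Hoffman, 30 Ss", 34.3, 43.6),
    ("Tinker, 74 Ss", 18.7, 23.4),
    ("Tinker, 64 Ss", 25.2, 34.3),
]


def table_increases() -> dict:
    """Map each group label to its recomputed percent increase (1 decimal)."""
    return {label: round(percent_increase(a, b), 1) for label, a, b in TABLE_I}


if __name__ == "__main__":
    for label, pct in table_increases().items():
        print(f"{label}: {pct:.1f} percent")
```

All four recomputed increases fall between roughly 25 and 36 percent, which is the sense in which the results are said to be of the same order.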


When the experimental procedure involves not normal reading but a 
critical examination of printed matter with the attendant requirements for 
speed and accuracy, the observed blink-rates usually are inconclusive. 
This suggests that the severity of the visual task introduces factors which 
neutralize any changes in rate that might be caused by the experimental 
variable. Therefore, none of the tests conducted by Tinker, Bitterman or 
Hoffman detracts from the validity of the blink-rate as a criterion of ease 
of seeing. 
FREQUENCY OF BLINKING IN VISUAL WORK: 
A REPLY TO DR. LUCKIESH 


BY M. E. BITTERMAN 


Cornell University 


The foregoing note by Dr. Luckiesh represents a considerable modification 
of his position with respect to the value of frequency of blinking in the 
study of visual fatigue and efficiency. Consider, for example, the following 
passage from his latest book: 


. . . We began a decade ago a series of systematic researches which yielded many 
useful data. . . . All the data combined have established the rate of involuntary blinking 
as a sensitive criterion of ease of seeing, provided representative groups of subjects are used 
under carefully controlled conditions. 

The blink-rates of various individuals differ widely and are different for different tasks. 
One person may blink several times faster than another person under identical conditions. 
Furthermore, the blink-rate of an individual depends upon what he is doing. However, 
these differences are of no consequence when a specific task, such as reading, is chosen and the 
same subjects are used for studying the relative effects of two or more conditions (5, pp. 205-206).¹ 


Investigations with the Minnesota Clerical Test (2) and the Tinker Speed 
of Reading Test (7) which Luckiesh criticizes, although they involved 
visual tasks different from the one he employed, nevertheless provide a clear 
refutation of this broad claim. 

Luckiesh now appears to support the validity of blink rate only as an 
index of the ease of ‘normal’ and ‘continuous’ reading—although his failure 
to check up on the performance of his Ss leaves him with no way of knowing 
whether their reading is either normal or continuous. In support of this 
more restricted position, Luckiesh asserts that whenever conditions similar 
to his own have been employed, the experimental results have agreed 
‘qualitatively’ with those he has obtained. This assertion appears to the 
writer to represent both a misinterpretation and an unwarranted selection 
of evidence. 

The studies cited by Luckiesh are those of Hoffman (4) and of Tinker 
(6). While it is true that Hoffman, working with 30 Ss, found a reliable 
mean increase in blink rate (9.3 blinks/five min.) in the course of one hour 
of reading, the mean increases reported by Tinker for two groups of Ss 
(N₁ = 74, N₂ = 64) were not reliable (CR₁ = .32, CR₂ = .42). Luckiesh’s 
remarks on the question of statistical significance are not very clear. It is 
interesting to note, however, that in his opinion “there is no assurance that 
the S will perform the same way a second time.” Can this comment be 
taken to mean that blink measurements are not very reliable? 

Luckiesh fails to cite two studies which, although as similar to his own 
as Hoffman’s or Tinker’s, did not provide support for his position. The 
first, by Bitterman (2), justified the acceptance of a null hypothesis with 


1 Italics mine. 


